SPECIFICATION
Computer-assisted Means For Assessing Lifestyle Risk Factors
Field of the Invention 

The present invention relates to methods of assessing disease 
susceptibility. In particular, it relates to methods of 
assessing disease susceptibility associated with dietary and 
lifestyle risk factors. 

Background to the Invention 

Cancer is a disease influenced primarily by external factors.
Up to 80% of human cancers arise from exposure to
environmental agents. The majority of cancer is believed to
be preventable because exposure to these external factors
should be manageable (Giovannucci, 1999; Perera, 2000).

Human tumours result from a series of mutational events,
leading to the loss of the regulatory mechanisms that govern
normal cell behaviour and ultimately resulting in the
formation of a tumour with full metastatic (or invasive)
potential (Smith, 1995). All higher organisms have developed
a complex variety of mechanisms to protect themselves from
environmental insult, for example from ingested plant toxins.
One of the most important protection measures involves the
metabolism of toxins (or xenobiotics) leading to
detoxification and ultimately excretion of the toxin (Smith,
1995). Unfortunately, the metabolic pathways do not always
lead to detoxification of the toxin. Indeed, many chemical
carcinogens are activated by these same metabolic pathways to
react with cellular macromolecules.

Improvements in genetic analysis and the availability of
human genetic sequence information arising from the Human
Genome Project have added another facet to the analysis of
cancer susceptibility, that of inter-individual variation at
the genome level. Molecular epidemiology has already begun
to clarify some of the gene-environment interactions that may
lead to disease. The ultimate goal of molecular epidemiology
is to develop risk assessment models for individuals, and
already the field has provided insight into inter-individual
variation in human cancer risk (Shields, 2000). Molecular
epidemiology focuses on three major determinants of human
cancer risk: inherited host susceptibility factors, molecular
dosimetry of carcinogen exposure, and biomarkers of early
effects of carcinogenic exposure. The variability in
metabolic activity, detoxification and DNA repair across the
US population could be as high as 85- to 500-fold, with
correspondingly high variability in cancer risk (Hattis,
1986). Considering the latency of cancer, the importance of
correlating individual risk with biomarkers at an early stage
becomes apparent. These biomarkers can help to identify
populations or individuals at risk of cancer resulting from
specific environment-gene interactions.

Defining the factors that contribute to inter-individual
variations in cancer susceptibility has been a major focus of
research for many years. Given the suggested role of
environmental factors in carcinogenesis, some of the
candidate genes are those that encode the
xenobiotic-metabolising enzymes that activate or inactivate
carcinogens. Variable levels of expression of these enzymes
could result in increased or decreased carcinogen activation.
Other genetic factors that could contribute to cancer
susceptibility include genes involved in DNA repair,
proto-oncogenes, tumour suppressor genes and cell-cycle
genes, as well as genes involved in aspects of nutrition,
hormonal status, and immunological responses. Emerging data
from the Human Genome Project have led to studies showing
that combinations of metabolic polymorphisms are increasingly
linked to a greater risk of cancer (Perera, 1997). Studies
which have measured the formation of DNA adducts as a marker
of enzyme activity have found that the levels of DNA damage
or protein adducts vary considerably between persons with
apparently similar exposure (Bryant, 1987; Perera, 1992;
Mooney, 1995). The observed variability reflects a
combination of true biological factors that is not accounted
for by differences in exposure or by laboratory variation
(Dickey, 1997). In fact, lower exposures to carcinogens can
result in proportionately higher adduct levels because of a
person's genetic predisposition for increased carcinogen
metabolic activation (Kato, 1995; Vineis, 1997).

The existence of multiple alleles at loci that encode
xenobiotic-metabolising enzymes can result in differential
susceptibilities of individuals to the carcinogenic effects
of various chemicals. Metabolism in humans occurs in two
distinct phases: Phase I metabolism involves the addition of
an oxygen atom or a nitrogen atom to lipophilic (fat-soluble)
compounds such as steroids, fatty acids and xenobiotics (from
external sources such as diet and smoke) so that they can be
conjugated to glutathione or N-acetylated by the Phase II
enzymes (and thus made water-soluble) and excreted from the
body. There are superfamilies of xenobiotic-metabolising
enzymes: cytochrome P450s (Phase I), GSTs (Phase II) and NATs
(Phase I and II), which are thought to have evolved as an
adaptive response to environmental insult. Alterations in
the activity of these enzymes are predicted to result in an
altered susceptibility to cancer (Hirvonen, 1999).



Enzymatic activation of xenobiotics is not, however, the only 
route to cancer development. Epidemiological studies suggest 
that nutritional factors may also play a causative role in 
more than 30% of human cancers. However, defining the 
precise roles of specific dietary factors in the development 
of cancer is difficult due to the multitude of variables 
involved (Perera, 2000). Specific dietary factors are not
easily measured as a single quantifiable variable, such as 
number of cigarettes smoked per day. Further complications 
arise due to differences in methodology, control populations, 
types of carcinogens, and amounts of exposure to carcinogens. 

Priorities for studies relating to the interrelationship of
dietary factors and cancer susceptibility include
identification of genetic factors that contribute to
individual cancer risk, identification of cancer-preventative
chemicals in fruits and vegetables, better understanding of
the carcinogenic role of polycyclic aromatic hydrocarbons and
heterocyclic amines generated by cooking meats at high
temperature, and better understanding of the association of
increased caloric intake with increased cancer risk (Perera,
2000).

Increased consumption of vegetables and fruits is correlated
with a decreased risk of cancer, and studies of this aspect
of nutritional effects on cancer have led to the
identification of other enzymes and micronutrients involved
in the maintenance of a normal cellular phenotype
(Giovannucci, 1999).

The quarter of the US population with the lowest intake of
fruits and vegetables has roughly twice the cancer rate for
most types of cancer (lung, larynx, oral cavity, oesophagus,
stomach, colon and rectum, bladder, pancreas, cervix, and
ovary) when compared with the quarter with the highest intake
(Ames, 1999). Fruit and vegetables are high in folate and
antioxidants. Low intake can lead to micronutrient
deficiency, which has been shown to cause DNA damage in a way
that mimics radiation damage by causing single- and
double-stranded breaks, oxidative lesions or both. The
micronutrients whose deficiency is correlated with
DNA-damaging activity include folate (or folic acid), iron,
zinc, and vitamins B12, B6, C and E (Ames, 1999).

Of the cancers that are correlated with nutritional effects,
colon cancer (colorectal neoplasia) has among the strongest
links to diet. In the US, colon cancer is the fourth most
common incident cancer and the second most common cause of
cancer death, with 130,000 new cases and 55,000 deaths per
year (Potter, 1999). According to the WHO, colorectal
cancers are the second most common cause of cancer death in
Britain (WHO, 1997). Worldwide, colon cancer represents 8.5%
of new cancer cases reported, with the highest rates seen in
the developed world and the lowest rates in India. Colon
cancer occurs with approximately equal frequency in men and
women, and the occurrence appears to be highly sensitive to
changes in the environment. Immigrant populations assume the
incidence rates of the host country very rapidly, often
within the generation of the initial immigrant (Potter,
1999).

Risk factors for colon cancer include a positive family
history, meat consumption, smoking and alcohol consumption
(Giovannucci, 1999). There is an inverse relationship, i.e.
lower risk, associated with consumption of vegetables, high
folate intakes, use of non-steroidal anti-inflammatory drugs,
hormone replacement therapy and physical activity. Meat and
tobacco smoke are sources of carcinogens, while vegetables
are a source of folate and antioxidants and have Phase II
(detoxifying) enzyme-inducing ability (Taningher, 1999).

Diets rich in raw vegetables, green vegetables, and
cruciferous vegetables are associated with a decreased risk
of colon cancer. Diets high in fibre, from vegetables and
cereals, have been associated with a greater than two-fold
decrease in risk of colorectal adenomas in men. The data on
fruit in the diet are not as consistent to date (WCRF, 1997),
but a recent report (Eberhart, 2000) measured potent
anti-oxidant activity of phytochemicals in apple skins with
the ability to inhibit growth of tumour cell lines in vitro,
so it is possible that more clearly defined links will emerge
in the future. Lower risk of colon cancer is associated with
high folate intakes, but actual consumption of vegetables,
rather than specific micronutrient preparations or vitamin
supplements, is most consistently associated with low risk
(Potter, 1999).

Other cancers that have been correlated with nutrition
include prostate and breast. These malignancies are largely
influenced by a combination of factors related to diet and
nutrition. Prostate cancer is associated with high
consumption of milk, dairy products and meats. These
products decrease levels of 1,25(OH)2 vitamin D, which is a
cell differentiator. Low levels of 1,25(OH)2 vitamin D may
enhance prostate carcinogenesis by preventing cells from
undergoing terminal differentiation, so that they continue to
proliferate (Giovannucci, 1999). Breast, colon, and prostate
cancers are relatively rare in less economically developed
countries, where malignancies of the upper gastrointestinal
tract are quite common. The cancers of the upper
gastrointestinal tract have been related to various food
practices or preservation methods other than refrigeration.
For example, cancer of the mouth and pharynx is the sixth
most common cancer world-wide and has been linked to alcohol
consumption, tobacco, salt-preserved meat and fish, smoked
foods and charcoal-grilled meat, as well as ingestion of
beverages drunk very hot. Thus, diet can be a direct supply
of genotoxic compounds or may cause chronic irritation or
inflammation (Giovannucci, 1999).

In recent years, many genes involved in the processes
described above and other areas of metabolism have been found
to exist in allelic form. Therefore, certain populations,
subpopulations, races etc. have greater or lesser
susceptibility to particular diseases linked with variation
in alleles of some genes. For many decades, health advice,
for example relating to diet, exercise, smoking and
sunbathing, has been issued by Governments, charities and
health advisory bodies. Such advice has been directed only at
the population as a whole, or, at best, to groups such as the
elderly, children and pregnant women. Such advice can
therefore only be very general and cannot, by its very
nature, take account of the particular genotype of an
individual. Moreover, in recent years, there has been much
media publicity of research findings on links between
particular foods, drugs etc. and medical conditions, often
causing health scares. As the factors that contribute to
disease susceptibility, for example susceptibility to cancer
or cardiovascular disease, vary between populations and
between individuals of populations, it is often impossible
for an individual to derive useful advice appropriate to his
or her particular circumstances from such reports.

Summary of The Invention 



In order to enable individuals to protect and manage their
own health, there is a need for individuals to have
personally-tailored information about risk factors which may
be important to that individual's well-being and
personally-tailored advice on reducing the risk of disease.

Accordingly, the invention provides a computer-assisted
method of providing a personalized lifestyle advice plan for
a human subject comprising:

(i) providing a first dataset on a data processing means,
said first dataset comprising information correlating the
presence of individual alleles at genetic loci with a
lifestyle risk factor, wherein at least one allele of each
genetic locus is known to be associated with increased or
decreased disease susceptibility;

(ii) providing a second dataset on a data processing means, 
said second dataset comprising information matching each said 
risk factor with at least one lifestyle recommendation; 

(iii) inputting a third dataset identifying alleles at one or
more of the genetic loci of said first dataset of said human
subject;

(iv) determining the risk factors associated with said
alleles of said human subject using said first dataset;

(v) determining at least one appropriate lifestyle 
recommendation based on each identified risk factor from step 
(iv) using said second dataset; and 

(vi) generating a personalized lifestyle advice plan based on 
said lifestyle recommendations. 
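
Steps (i) to (vi) can be illustrated with a minimal Python
sketch. It is not the implementation of the invention: the
locus names, allele labels, risk factors and recommendations
are hypothetical placeholders, and the two dictionaries simply
stand in for the first and second datasets.

```python
from typing import Dict, List, Tuple

# Step (i): hypothetical first dataset mapping (locus, allele) to
# lifestyle risk factors. All entries are illustrative placeholders.
FIRST_DATASET: Dict[Tuple[str, str], List[str]] = {
    ("LOCUS_A", "variant"): ["risk_factor_1"],
    ("LOCUS_B", "null"): ["risk_factor_1", "risk_factor_2"],
}

# Step (ii): hypothetical second dataset mapping each risk factor to
# at least one lifestyle recommendation.
SECOND_DATASET: Dict[str, List[str]] = {
    "risk_factor_1": ["recommendation_X"],
    "risk_factor_2": ["recommendation_Y", "recommendation_Z"],
}


def generate_advice_plan(subject_alleles: Dict[str, str]) -> List[str]:
    """Steps (iii)-(vi): subject_alleles is the third dataset (locus -> allele)."""
    # Step (iv): collect the risk factors linked to the subject's alleles.
    risk_factors: List[str] = []
    for locus, allele in subject_alleles.items():
        for factor in FIRST_DATASET.get((locus, allele), []):
            if factor not in risk_factors:
                risk_factors.append(factor)

    # Step (v): look up the recommendations for each risk factor.
    recommendations: List[str] = []
    for factor in risk_factors:
        for rec in SECOND_DATASET.get(factor, []):
            if rec not in recommendations:
                recommendations.append(rec)

    # Step (vi): the plan is the de-duplicated list of recommendations.
    return recommendations


plan = generate_advice_plan({"LOCUS_A": "variant", "LOCUS_B": "null"})
```

In this sketch an allele that is absent from the first dataset
contributes no risk factor, so a subject carrying only such
alleles receives an empty plan.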



By lifestyle risk factors is meant risk factors associated
with dietary factors and with exposure to environmental
factors, such as smoking, environmental chemicals or
sunlight. Similarly, lifestyle recommendations should be
interpreted as recommendations relating to dietary factors
and exposure to environmental factors, such as smoking,
environmental chemicals or sunlight. Disease susceptibility
should be interpreted to include susceptibility to conditions
such as allergies.

Thus, the method allows individualised advice to be generated
based on the unique genetic profile of an individual and the
susceptibility to disease associated with the profile. By
individually assessing the genetic make-up of the client,
specific risk factors can be identified and dietary and other
health advice tailored to the individual's needs. In a
preferred embodiment, the lifestyle advice will include
recommended minimum or maximum amounts of food types. (Note
that an amount may be 0.)

Information concerning the sex and health of the individual
and/or of the individual's family may also provide
indications that a particular polymorphism or group of
polymorphisms associated with a particular condition should
be investigated. Such information may therefore be used in
selection of polymorphisms to be screened for in the method
of the invention.

Such factors may also be used in the determination of
appropriate lifestyle recommendations in step (v) of the
method. For example, recommendations relating to reducing
susceptibility to prostate cancer would not be given to women
and recommendations relating to susceptibility to ovarian
cancer would not be given to men. Other factors, such as
information regarding the age, alcohol consumption, and
existing diet of the client may be incorporated into the
determination of appropriate lifestyle recommendations in
step (v).
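
The use of the subject's sex in step (v) can be sketched as a
simple filter, shown here in Python. The `Recommendation`
class, its `applies_to_sex` field and the example texts are
assumptions made for illustration only; they are not taken
from the specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Recommendation:
    """Hypothetical record for one lifestyle recommendation."""
    text: str
    # None means the recommendation applies regardless of sex.
    applies_to_sex: Optional[str] = None


def filter_by_sex(recs: List[Recommendation], sex: str) -> List[Recommendation]:
    """Keep recommendations that are general or match the subject's sex."""
    return [r for r in recs if r.applies_to_sex in (None, sex)]


candidates = [
    Recommendation("general dietary advice"),
    Recommendation("advice on prostate cancer susceptibility", "male"),
    Recommendation("advice on ovarian cancer susceptibility", "female"),
]
for_female_subject = filter_by_sex(candidates, "female")
```

Age, alcohol consumption or existing diet could be handled by
further filters of the same shape.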

The report comprising the personalised dietary advice may be
delivered to the client by any suitable means, for example by
letter, facsimile or electronic means, such as e-mail.
Alternatively, the report may be posted on a secure Web-page
of the service provider with access limited to the client by
the use of a unique identifier notified to the client either
by conventional or electronic mail. The report can therefore
comprise one or more hyperlinks to other documents of the
report provider's Web-site or to other Web-sites giving
relevant information on the particular polymorphisms
identified, disease prevention and/or dietary advice.

As such sites could be updated and new hyperlinks added to
the report after the report is initially delivered to the
client, the information and advice could be updated at any
time, thereby allowing the client to access up-to-date yet
personalised health and dietary advice over a prolonged
period, without the need to request another report.

Preferably, the method will involve assessing a variety of
loci in order to give a broad view of susceptibility and
possible means of minimising disease risk. Although
individual polymorphisms may be considered biomarkers for
individual cancer risk, the different biomarkers, when
considered together, may also reveal a significant cancer
risk. For example, the correlation between CYP1A1 activity
and cancer susceptibility varies, dependent on the presence
of specific types of CYP1A1 polymorphism as well as the
presence of GSTM1 polymorphisms. An individual with an
extremely active CYP1A1 gene, leading to high Phase I P450
activity, in combination with a null GSTM1 genotype that
lacks the detoxifying Phase II activities, has a very high
risk of developing cancer (Taningher, 1999).

The presence of a particular polymorphism may be indicative
of increased susceptibility to one disease while being
indicative of decreased susceptibility to another disease.
For example, one allele of the gene encoding epoxide
hydrolase, which catalyses the conversion of toxic PAH
metabolites formed by CYP1A1 and CYP1A2 into less toxic and
more water-soluble trans-dihydrodiols, has recently been
found to be associated with increased risk of
aflatoxin-induced liver cancer, but also with decreased risk
of ovarian cancer (Pluth, 200; Taningher, 1999).

Therefore, it will be important to assess the risk factors
associated with other polymorphisms to give meaningful advice
on maintaining optimal health.

Preferred genes for which polymorphisms are identified
include genes that encode Phase I metabolism enzymes
responsible for detoxification of xenobiotics, genes that
encode Phase II metabolism enzymes responsible for further
detoxification and excretion of xenobiotics, genes that
encode enzymes that combat oxidative stress, genes associated
with micronutrient deficiency (for example, deficiency of
folate, B12 or B6), genes that encode enzymes responsible for
metabolism of alcohol, genes that encode enzymes involved in
lipid and/or cholesterol metabolism, genes that encode
enzymes involved in clotting, genes that encode trypsin
inhibitors, genes that encode enzymes related to
susceptibility to metal toxicity, genes which encode proteins
required for normal cellular metabolism and growth, and genes
which encode HLA Class 2 molecules.

The method of the invention may include the step of
determining the presence of individual alleles at one or more
genetic loci of the DNA in a DNA sample of the subject, and
constructing the dataset used in step (iii) using results of
that determination.

Techniques for determining the presence or absence of
individual alleles are known to the skilled person. They may
include techniques such as hybridization with allele-specific
oligonucleotides (ASO) (Wallace, 1981; Ikuta, 1987;
Nickerson, 1990; Varlaan-de Vries, 1986; Saiki, 1989; Zhang,
1991), allele-specific PCR (Newton, 1989; Gibbs, 1989),
solid-phase minisequencing (Syvanen, 1993), oligonucleotide
ligation assay (OLA) (Wu, 1989; Barany, 1991; Abravaya,
1995), 5' fluorogenic nuclease assay (Holland, 1991 & 1992;
Lee, 1998; US patents 4,683,202, 4,683,195, 5,723,591 and
5,801,155), or restriction fragment length polymorphism
(RFLP) (Donis-Keller, 1987).

In a preferred embodiment, the genetic loci are assessed via
a specialised type of PCR used to detect polymorphisms,
commonly referred to as the Taqman® assay, in which
hybridisation of a probe comprising a fluorescent reporter
molecule, a fluorescent quencher molecule and a minor groove
binding chemical to a region of interest is detected by
removal of quenching of the fluorescent molecule and
detection of resultant fluorescence. Details are given below.

In another embodiment, the genetic loci are assessed via
hybridisation with allele-specific oligonucleotides, the
allele-specific oligonucleotides being preferably arranged as
an array of oligonucleotide spots stably associated with the
surface of a solid support.

The arrays suitable for use in the method of the invention
form a further aspect of the present invention.

In order to assay the sample for the alleles to be
identified, the fragments of DNA comprising the gene(s) of
interest may be amplified to produce a sufficient amount of
material to be tested.

The present inventors have designed a number of specific
primer sets for amplification of gene regions of interest.
Such primers may be used in pairs to isolate a particular
region of interest. Therefore in a further aspect of the
invention, there is provided a primer having a sequence
selected from SEQ ID NO: 86-99, 104-163. In another aspect,
there is provided a primer pair comprising a primer having
SEQ ID NO: n, where n is an even number from 86-98 or
104-162, in conjunction with a primer having SEQ ID NO: (n+1).
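
The pairing rule for the SEQ ID numbers can be made concrete
with a short Python sketch. It enumerates only the numeric
identifiers stated above; the function name is an arbitrary
choice and the primer sequences themselves are not
represented.

```python
from typing import List, Tuple


def primer_pair_ids() -> List[Tuple[int, int]]:
    """Return (n, n+1) for every even n in 86-98 and 104-162."""
    even_ids = list(range(86, 99, 2)) + list(range(104, 163, 2))
    return [(n, n + 1) for n in even_ids]


pairs = primer_pair_ids()
# Every SEQ ID NO used by a pair, for comparison with 86-99 and 104-163.
covered_ids = {seq_id for pair in pairs for seq_id in pair}
```

Under this rule the pairs together use exactly the primers of
SEQ ID NO: 86-99 and 104-163, each primer appearing in one
pair.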

Preferably, however, the primer sets will be used together
with other primer sets to provide multiplexed amplification
of a number of regions to allow determination of a number of
polymorphisms from the same sample. Therefore in a further
aspect of the invention, there is provided a primer set
comprising at least 5, more preferably at least 10 or 15,
primer pairs selected from SEQ ID NO: 86-121.

Brief Description of the Drawings

Figure 1 shows examples of databases 1 and 2 which may be
used in an embodiment of the present invention.

Figure 2 is a flow chart illustrating an embodiment of the
invention.

Detailed Description of the Invention 
Selection of Genetic Polymorphisms for Datasets

The correct selection of genetic polymorphisms is important
to the provision of accurate and meaningful advice. Although
not limited to such classes of polymorphisms, in a preferred
embodiment of the present invention, markers for
polymorphisms of one or more of the following classes of
genes are used:

The first dataset of the method of the invention may comprise
information relating to two or more alleles of one or more
genetic loci of genes selected from the group comprising:

(a) genes that encode enzymes responsible for detoxification 
of xenobiotics in Phase I metabolism; 

(b) genes that encode enzymes responsible for conjugation
reactions in Phase II metabolism;

(c) genes that encode enzymes that help cells to combat
oxidative stress;

(d) genes associated with micronutrient deficiency;

(e) genes that encode enzymes responsible for metabolism of
alcohol;

(f) genes that encode enzymes involved in lipid and/or
cholesterol metabolism;

(g) genes that encode enzymes involved in clotting;

(h) genes that encode trypsin inhibitors;

(i) genes that encode enzymes related to susceptibility to
metal toxicity;

(j) genes which encode proteins required for normal cellular
metabolism and growth;

(k) genes which encode HLA Class 2 molecules.

The dataset will preferably comprise information relating to
two or more alleles of at least two genetic loci of genes
selected from the group comprising categories a - k as
described above, for example, a+b, a+c, a+d, a+e, a+f, a+g,
a+h, a+i, a+j, a+k, b+c, b+d, b+e etc., c+d, c+e etc., d+e,
d+f etc., e+f, e+g etc., f+g, f+h etc., g+h, g+i, g+k, h+i,
h+k. Where the dataset comprises information relating to two
or more alleles of at least two genetic loci, it is preferred
that at least one of the genetic loci is of category d, due
to the central role of micronutrients in the maintenance of
proper cellular growth and DNA repair, and due to the
association of micronutrient metabolism or utilisation
disorders with several different types of diseases (Ames,
1999; Perera, 2000; Potter, 2000). More preferably, the
dataset will comprise information relating to two or more
alleles of at least three genetic loci selected from the
group comprising categories a - k as described above. Where
the dataset comprises information relating to alleles of at
least three genetic loci, it is preferred that at least two
of the genetic loci are of categories d and e. Information
relating to polymorphisms present in both of these categories
is particularly useful due to the effects of alcohol
consumption and metabolism on the efficiency of enzymes
related to micronutrient metabolism and utilisation (Ulrich,
1999). In a further preferred embodiment, where the dataset
comprises information relating to alleles of at least three
genetic loci, it is preferred that at least two of the
genetic loci are of categories a and b due to the close
interaction of Phase I and Phase II enzymes in the metabolism
of xenobiotics. Even more preferably, the dataset will
comprise information relating to two or more alleles of at
least four genetic loci of genes selected from the group
comprising categories a - k as defined above, for example,
a+b+c+d, a+b+c+e, a+b+d+e, a+c+d+e, b+c+d+e etc. Where the
dataset comprises information relating to alleles of at least
four genetic loci, it is preferred that at least three of the
genetic loci are of categories d, e and f. Information
relating to polymorphisms present in these three categories
is particularly useful due to the strong correlation of
polymorphisms of these alleles with coronary artery disease,
which arises from the combined effects of altered
micronutrient utilisation, affected adversely by alcohol
metabolism, together with imbalances in fat and cholesterol
metabolism. Further, where the dataset comprises information
relating to alleles of at least five genetic loci, it is
preferred that at least four of the genetic loci are of
categories a, b, d and e. Information relating to
polymorphisms present in these four categories is
particularly useful due to the combined effects of
micronutrient utilisation, alcohol metabolism, Phase I
metabolism of xenobiotics and Phase II metabolism on the
further metabolism and excretion of potentially harmful
metabolites produced in the body (Taningher, 1999; Ulrich,
1999). Similarly, the dataset may comprise information
relating to two or more alleles of at least five, for example
a, b, d, e and f, six, seven, eight, nine or ten genetic loci
of genes selected from the group comprising categories a - k
as defined above.
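
The category preferences set out above can be summarised in a
small Python sketch. The table of preferences is transcribed
from the preceding paragraph; the function name and the
representation of a dataset as a list of category letters,
one per locus, are assumptions made for illustration.

```python
from itertools import combinations
from typing import List, Set, Tuple

CATEGORIES = "abcdefghijk"

# (minimum number of loci, alternative sets of preferred categories),
# transcribed from the text: d for two loci; d and e, or a and b, for
# three; d, e and f for four; a, b, d and e for five.
PREFERENCES: List[Tuple[int, List[Set[str]]]] = [
    (2, [{"d"}]),
    (3, [{"d", "e"}, {"a", "b"}]),
    (4, [{"d", "e", "f"}]),
    (5, [{"a", "b", "d", "e"}]),
]


def satisfied_preferences(locus_categories: List[str]) -> List[int]:
    """Return the loci thresholds whose stated preference is met."""
    present = set(locus_categories)
    count = len(locus_categories)
    return [
        minimum
        for minimum, alternatives in PREFERENCES
        if count >= minimum and any(req <= present for req in alternatives)
    ]


# All two-category combinations of the kind listed as a+b, a+c and so on.
category_pairs = ["+".join(pair) for pair in combinations(CATEGORIES, 2)]
```

For instance, a dataset with one locus from each of categories
a, b, d, e and f meets every preference in the table, whereas
one covering only a, b and c meets only the three-locus
preference for categories a and b.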

Preferably, the dataset will comprise information relating to
two or more alleles of one or more genetic loci of genes
selected from each member of the group comprising categories
a - k as described above. In a preferred embodiment, the
first dataset comprises information relating to two or more
alleles of the genetic loci of genes encoding each of the
cytochrome P450 monooxygenase, N-acetyltransferase 1,
N-acetyltransferase 2, glutathione-S-transferase, manganese
superoxide dismutase, 5,10-methylenetetrahydrofolate
reductase and alcohol dehydrogenase 2 enzymes. In a more
preferred embodiment the first dataset further comprises
information relating to two or more alleles of the genetic
loci of genes encoding one or more, preferably each, of
epoxide hydrolase (EH), NADPH-quinone reductase (NQO1),
paraoxonase (PON1), myeloperoxidase (MPO), alcohol
dehydrogenase 1, alcohol dehydrogenase 3, cholesteryl ester
transfer protein, apolipoprotein A IV, apolipoprotein E,
apolipoprotein C III, angiotensin, factor VII, prothrombin
20210, β-fibrinogen, heme oxygenase-1, α-antitrypsin, SPINK1,
δ-aminolevulinic acid dehydratase, interleukin 1, interleukin
1, vitamin D receptor, B1 kinin receptor,
cystathionine-beta-synthase, methionine synthase (B12 MS),
5-HT transporter, transforming growth factor beta 1 (TGFβ1),
L-myc, HLA Class 2 molecules, T-lymphocyte associated antigen
4 (CTLA-4), interleukin 4, interleukin 3, interleukin 6, IgA,
and/or galactose metabolism gene GALT.

Genes that encode enzymes responsible for (a) detoxification
of xenobiotics in Phase I metabolism; and (b) conjugation
reactions in Phase II metabolism

Xenobiotics are potentially toxic compounds found in, for
example, char-grilled red meat. Meat consumption, especially
of well-done meat cooked at high temperatures, is associated
with increased risk of cancer (Sinha, 1999). Cooking meat in
this fashion leads to the production of heterocyclic amines
(HCA), nitrosamines (NA), and polycyclic aromatic
hydrocarbons (PAH), which have known carcinogenic activity in
animals (Hirvonen, 1999; Layton, 1995).

Detoxification of xenobiotics occurs in two phases in humans:
Phase I metabolism involves the addition of an oxygen atom or
a nitrogen atom to lipophilic (fat-soluble) compounds, such
as steroids, fatty acids and xenobiotics (from external
sources such as diet and smoke), so that they can be
conjugated by the Phase II enzymes (and thus made
water-soluble) and excreted from the body (Hirvonen, 1999).
Individuals with genetic polymorphisms correlated with cancer
risk in these genes should avoid consumption of char-grilled
foods, smoked fish, and well-done red meat whether grilled or
pan-fried (Sinha, 1999). They should also increase
consumption of food products known to increase Phase II
metabolism so that the products of Phase I metabolism may be
cleared more efficiently.

Specific examples of genes of category a for which
information relating to polymorphisms may be used in the
present invention include genes encoding cytochrome P450
monooxygenase (CYP), e.g. CYP1A1, CYP1A2, CYP2C, CYP2D6,
CYP2E1, CYP3A4 and CYP11B2; genes encoding
N-acetyltransferase 1, e.g. NAT1; genes encoding
N-acetyltransferase 2, e.g. NAT2; genes encoding epoxide
hydrolase (EH); genes encoding NADPH-quinone reductase
(NQO1); genes encoding paraoxonase (PON1); and genes encoding
myeloperoxidase (MPO).

CYP is also referred to as cytochrome P450 monooxygenase (the
gene is called CYP, the enzyme is called P450). P450 enzymes
belong to a super-family with wide substrate activity that
catalyses the insertion of an oxygen atom into a substrate.
The reaction can convert a molecule (procarcinogen) into a
DNA-reactive electrophilic carcinogen (Hirvonen, 1999; Smith,
1995). Polymorphisms in genes encoding cytochrome P450 (CYP
family of genes) are associated with altered susceptibility
to cancer and coronary artery disease (CAD) and with altered
metabolism of various pharmaceutical agents (Poolsup, 2000;
Miki, 1999; Cramer, 2000; Marchand, 1999; Sinha, 1997).

CYP1A1 codes for a P450 enzyme that metabolises polycyclic
aromatic hydrocarbons (PAH). The CYP1A1 gene is polymorphic
and is inducible by PAH, which means that expression of the
enzyme is increased upon exposure to PAH (MacLeod, 1997).
CYP1A1 is located on chromosome 15q22-q24 (Smith, 1995).
This gene has been linked to colorectal, urinary bladder,
breast, oral cavity, stomach, and lung cancers (Perera, 2000;
Garte, 1998). The gene product, the P450 enzyme, is
inducible by exposure to the agents that it metabolises, so
the consumption of high levels of a potential source of
carcinogens, such as well-done red meat, would increase the
production of the enzyme and thus the creation of
carcinogenic substances (Mooney, 1996; Perera, 2000;
Alexandrie, 2000). Studies of polymorphisms of the
CYP1A1 gene have revealed considerable differences in enzyme
activity, with corresponding differences in cancer risk after
exposure to known substrates of the enzyme (Alexandrie, 2000;
Rojas, 2000; Garte, 2000). Both the Ile-Val polymorphism,
which comprises an A4889G substitution (i.e. the adenine
residue at position 4889 of the 5' - 3' strand is substituted
by a guanine residue), and the CYP1A1*C polymorphism, which
comprises a T6235C substitution, are induced to a greater
extent than the wild type gene after exposure to PAH, and
have been associated with a significant increase in cancer
risk (Taningher, 1999; Garte, 1998; Kawajiri, 1996; MacLeod,
1997; Smith, 1995). Approximately 10 percent of the
Caucasian population carries polymorphisms linked to cancer
risk, according to a recent American review paper (Shields,
2000). Polymorphisms in genes encoding CYP1A2, CYP2C, CYP2D6,
CYP2E1, CYP3A4 and CYP11B2 are associated with altered
susceptibility to cancer and drug sensitivity (Poolsup,
2000; Miki, 1999; Cramer, 2000; Marchand, 1999; Sinha, 1997).
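
The substitution labels used in this description encode the wild
type base, the nucleotide position and the substituted base (for
example A4889G: adenine at position 4889 replaced by guanine). A
minimal Python sketch of how such labels might be parsed when
building a dataset is given below. The names `Substitution` and
`parse_substitution` are illustrative assumptions and do not form
part of the described method.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    ref: str       # base in the wild type sequence
    position: int  # nucleotide position; may be negative
    alt: str       # base in the polymorphic sequence


# Accepts "A4889G", the bracketed form "C(-344)T" and the form "T-28C".
_PATTERN = re.compile(r"^([ACGT])\(?(-?\d+)\)?([ACGT])$")


def parse_substitution(label: str) -> Substitution:
    """Split a substitution label into reference base, position and new base."""
    match = _PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    ref, position, alt = match.groups()
    return Substitution(ref, int(position), alt)
```

Calling `parse_substitution("A4889G")` gives the reference base
`A`, the position `4889` and the substituted base `G`.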

NAT1 (N-acetyltransferase 1) and NAT2 (N-acetyltransferase 2)
also activate PAH and heterocyclic amines (HAA). The enzymes
catalyse N-acetylation, O-acetylation, and N,O-acetylation.
The O-acetylation reaction is considered the most risky, with
the potential for forming chemical carcinogens that can bind
to DNA. The N-acetylation reaction can occur on a compound
after a P450 has inserted an oxygen, thus increasing the
water solubility of the compound so it may be excreted. Due
to this activity, the NAT enzymes are often considered as
both Phase I and Phase II type enzymes. The literature
describing a cancer link focuses on the activation activity
of the enzymes, so they are listed in the Phase I section
only. There are three separate N-acetyltransferase genes in
humans: two active genes, NAT1 and NAT2, and a pseudogene,
NATP. Pseudogenes have the same sequence, but lack apparent
function and promoter elements and are not expressed in cells
(i.e. the gene is not transcribed into RNA and then translated
into amino acids to make a protein/enzyme) (Perera, 2000).
The NAT1 and NAT2 genes are located on chromosome 8 at
8p21.3-21.1; both genes are 870 bp long and both code for a
protein 290 amino acids in length. The genes are highly
polymorphic and epidemiological studies have sometimes given
conflicting information regarding links with cancer. The
genes show geographical and ethnic variation and the enzyme
activity varies considerably between different tissues or
organs. There are approximately 20 polymorphisms for NAT1
known to date, but the list below only includes the
polymorphisms that have shown a link to cancer (Hein, 2000a).
The current list of nomenclature and polymorphisms is kept at
a web site:
http://www.louisville.edu/medschool/pharmacology/NAT.html.

Many of the epidemiological studies of both NAT1 and NAT2
used phenotyping assays, which measured enzyme activity, and
found fast and slow acetylator types, with the fast phenotype
carrying an increased risk for cancer in the colon (Perera,
2000). However, later analysis of the results found that the
fast/slow phenotype could vary considerably depending on the
substrate chosen for acetylation (Hein, 2000a). Recent
studies have used genetic sequence data to match acetylator
activity and cancer risk with polymorphism more precisely
(Hein, 2000b). Although the genes are the same size, they
act on different substrates. For example, caffeine is a
substrate for NAT2 but not for NAT1.

NAT1 is expressed to a higher degree than NAT2 in the colon,
so NAT1 may be associated with localised activity of
activated HAA or PAH in the colon (Brockton, 2000; Perera,
2000). The polymorphism NAT1*10, which comprises T1088A and
C1095A substitutions, and which has a fast phenotype, has
been consistently linked with an increased risk of colon
cancer and higher DNA adduct levels (i.e. DNA damage that can
lead to cancer) in colon tissue (Perera, 2000; Ilett, 1987).
The NAT1*11 polymorphism has been linked to risk of breast
cancer in women who smoke or consume well-done red meat
(Zheng, 1999). However, the phenotype is not well understood,
so this marker cannot be categorised as a fast or slow
acetylator (Doll, 1997). Two alleles of the NAT1*11
polymorphism are known: the NAT1*11A polymorphism, which
comprises C(-344)T, A(-40)T, G445A, G459A, T640G and C1095A
substitutions and a Δ9: 1065-1090 deletion; and the NAT1*11B
polymorphism, which comprises C(-344)T, A(-40)T, G445A,
G459A and T640G substitutions and a Δ9: 1065-1090 deletion.
References to NAT1*11 polymorphisms should be understood to
include reference to NAT1*11A or NAT1*11B polymorphisms.
NAT1*14, on the other hand, has little or no enzyme activity
(Brockton, 2000) and has been associated with increased lung
cancer risk (Bouchardy, 1998). Two alleles of the NAT1*14
polymorphism are known: the NAT1*14A polymorphism, which
comprises G560A, T1088A and C1095A substitutions; and the
NAT1*14B polymorphism, which comprises a G560A substitution.
References to NAT1*14 polymorphisms should, except where the
context dictates otherwise, be understood to include
reference to NAT1*14A or NAT1*14B polymorphisms. The NAT1*14
polymorphism shares a restriction enzyme site with the
NAT1*11 polymorphism, and some of the conflicting results
reported in the literature are believed to be due to the
inability of the assay used (restriction fragment length
polymorphism (RFLP) assay) to distinguish the polymorphisms
(Hein, 2000a). The oligonucleotide array suitable for use in
the present invention can distinguish all polymorphisms and
therefore will be more precise than the RFLP procedure.
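
Because several NAT1 alleles share substitutions (NAT1*14A
contains both substitutions of NAT1*10), an allele can only be
assigned once every defining variant has been typed. The
following Python sketch illustrates this with an exact-match
lookup. It is a simplified illustration, not the method of the
invention: the function name `call_nat1_allele` and the deletion
label `del9(1065-1090)` are assumptions, and the observed variants
are assumed to lie on a single chromosome copy.

```python
from typing import Iterable, Optional

# Allele definitions transcribed from the description above. The 9-base
# deletion is written with a made-up label, "del9(1065-1090)".
NAT1_ALLELES = {
    "NAT1*10": frozenset({"T1088A", "C1095A"}),
    "NAT1*11A": frozenset({"C(-344)T", "A(-40)T", "G445A", "G459A",
                           "T640G", "C1095A", "del9(1065-1090)"}),
    "NAT1*11B": frozenset({"C(-344)T", "A(-40)T", "G445A", "G459A",
                           "T640G", "del9(1065-1090)"}),
    "NAT1*14A": frozenset({"G560A", "T1088A", "C1095A"}),
    "NAT1*14B": frozenset({"G560A"}),
}


def call_nat1_allele(observed: Iterable[str]) -> Optional[str]:
    """Return the allele whose definition equals the observed variants.

    Exact matching is used so that NAT1*14A is not reported as NAT1*10.
    Returns None when no listed allele matches.
    """
    observed_set = frozenset(observed)
    for name, definition in NAT1_ALLELES.items():
        if observed_set == definition:
            return name
    return None
```

An assay that typed only T1088A and C1095A could not separate
NAT1*10 from NAT1*14A, since the two differ only at G560A.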

NAT2 is expressed primarily in the liver, but has been linked
with cancer incidence in other organs (Hein, 2000b).

NAT2*5A, which comprises T481C and T341C substitutions,
NAT2*6A, which comprises C282T and G590A substitutions, and
NAT2*7A, which comprises a G857A substitution, have reduced
acetylation activity (Hein, 2000b) and have been linked to
risk of bladder cancer (Taningher, 1999; Lee, 1998). NAT2*4
is considered the normal, or wild type, sequence. NAT2*4 has
fast acetylator activity and has been linked to increased
cancer risk in several studies (reviewed in Hein, 2000b; Gil,
1998), especially in conjunction with the NAT1*10
polymorphism (Bell, 1995). NAT2 rapid/intermediate
acetylators with at least one NAT2*4 allele have been linked
to breast cancer in women who consumed well-done red meat
(Dietz, 1999). Approximately 55% of the Caucasian population
carry NAT1 polymorphisms linked to cancer (Shields, 2000).

Polymorphisms in genes encoding epoxide hydrolase are
associated with cancer and chronic obstructive pulmonary
disease (Pluth, 200; Miki, 1999). Polymorphisms in genes
encoding NADPH-quinone reductase are associated with altered
susceptibility to cancer (Nakajima, 2000). Polymorphisms in
genes encoding paraoxonase are associated with altered
susceptibility to cancer and to CAD (MacKness, 2000).
Polymorphisms in genes encoding myeloperoxidase are
associated with altered susceptibility to CAD (Schabath,
2000).

Specific examples of genes of category b for which
information relating to polymorphisms may be used in the
present invention include genes encoding
glutathione-S-transferase, e.g. GSTM1, GSTP1 and GSTT1.

Glutathione-S-transferases catalyse the reaction of
electrophilic compounds with glutathione so the compounds may
be excreted from the body. The enzymes belong to a
super-family with broad and overlapping substrate
specificities. Glutathione-S-transferases provide a major
pathway of protection against chemical toxins and carcinogens
and are thought to have evolved as an adaptive response to
environmental insult, thus accounting for their wide
substrate specificity (Hirvonen, 1999). There are 4 family
members: alpha, mu, theta, and pi, also designated as A, M, T
and P. Polymorphisms have been identified in each family
(Perera, 2000). Individuals with low
glutathione-S-transferase activity should avoid meats cooked
at higher temperatures, as above, and increase fruit and
vegetable consumption. Cruciferous vegetables such as
broccoli and members of the allium family such as garlic and
onion have been shown to be potent inducers of these enzymes,
which would be expected to increase clearance of toxic
substances from the body (Cotton, 2000; Giovannucci, 1999).

GSTmu has 3 alleles: null; a, which is considered to be
the wild type; and b, which comprises a C534G
substitution, with no functional difference between the a and
b alleles. The GSTmu sub-type has the highest activity of
the 4 types and is predominantly located in the liver
(Hirvonen, 1999). Approximately half of the population has a
complete deletion of this gene, with a corresponding risk of
lung, bladder, breast, liver, and oral cavity cancer
(Shields, 2000; Perera, 2000). It has been estimated that
17% of all lung and bladder cancers may be attributable to
GSTM1 null genotypes (Hirvonen, 1999). The GSTM1 null
genotype together with a highly active CYP1A1 polymorphism
has been linked to a very high cancer risk in several studies
(Rojas, 2000; Shields, 2000). The GSTM1 gene is located on
chromosome 1p13.3 (Cotton, 2000).

The GSTpi gene is located on chromosome 11q13. This sub-type
is known to metabolise many carcinogenic compounds and is the
most abundant sub-type in the lungs (Hirvonen, 1999). Two
single nucleotide polymorphisms have been linked to cancer to
date: GSTP1*B, which comprises an A313G substitution, and
GSTP1*C, which comprises a C341T substitution. The enzymes
of these polymorphic genes have decreased activity compared
to the wild type and a corresponding increased risk of
bladder, testicular, larynx and lung cancer (Harries, 1997;
Matthias, 1998; Ryberg, 1997).

The GSTtheta gene is on chromosome 22q11.2 and is deleted in
approximately 20% of the Caucasian population. The enzyme is
found in a variety of tissues, including red blood cells,
liver, and lung (Potter, 1999). The deletion is associated
with an increased risk of lung, larynx and bladder cancers
(Hirvonen, 1999). Links with GSTM1 null genotypes are
currently being investigated, as it is believed that
individuals that have both GSTM1 and GSTT1 alleles deleted
will have a greatly increased risk of developing cancer
(Potter, 1999).

Genes that code for enzymes that help cells to combat 
oxidative stress 

Specific examples of genes of category c for which
information relating to polymorphisms may be used in the
present invention include genes encoding manganese superoxide
dismutase (MnSOD or SOD2 gene).

Manganese superoxide dismutase is an enzyme that destroys
free radicals, i.e. a free-radical scavenger. The gene is
located on chromosome 6q25.3, but the enzyme is found within
the mitochondria of cells. There are 2 polymorphisms linked
to cancer to date: an Ile58Thr allele, which comprises a
T175C substitution, and a Val(-9)Ala allele, which comprises
a T(-28)C substitution. A study of premenopausal women found
a four-fold increased risk of breast cancer in individuals
with the Val(-9)Ala polymorphism, and the highest risk within
this group was found in women who consumed low amounts of
fruits and vegetables (Ambrosone, 1999). This polymorphism
occurs in the signal sequence of the amino acid chain. The
signal sequence ensures transport of the enzyme into the
mitochondria of the cell, and so the polymorphism is believed
to reduce the amount of enzyme delivered to the mitochondria
(Ambrosone, 1999). The mitochondrion is commonly referred to
as the workhorse of the cell, where the energy-yielding
reactions take place. This is the site of many oxidative
reactions, so many free radicals are generated here.
Individuals with low activity of this enzyme should be
advised to take antioxidant supplements and increase
consumption of fruits and vegetables (Giovannucci, 1999;
Perera, 2000).

Genes associated with micronutrient deficiency, e.g. of
folate, vitamin B12 or vitamin B6

Specific examples of genes of category d for which
information relating to polymorphisms may be used in the
present invention include the gene encoding
5,10-methylenetetrahydrofolate reductase (MTHFR).

5,10-methylenetetrahydrofolate reductase is active in the
folate-dependent methylation of DNA precursors. Low activity
of this enzyme leads to an increase of uracil incorporation
into DNA (instead of thymine) (Ames, 1999). The MTHFR gene
is polymorphic and has been linked to colon cancer, adult
acute lymphocytic leukaemia and infant leukaemia (Ames, 1999;
Perera, 2000; Potter, 2000). Both the wt and polymorphic
alleles have been linked to disease, each being dependent on
levels of folate in the diet. Approximately 35% of the
Caucasian population has genetic polymorphisms at this locus,
with a corresponding risk of colon cancer (Shields, 2000).
Polymorphisms at this locus include those with a C677T or
A1298C substitution. Dietary recommendations for individuals
lacking in MTHFR activity include taking supplements with
folate and increasing consumption of fruit and vegetables
(Ames, 1999). Low levels of vitamins B12 and B6 have been
associated with low MTHFR activity and increased cancer risk,
so individuals should increase intake of these vitamins; B12
is found primarily in meat and B6 is found in whole grains,
cereals, bananas, and liver (Ames, 1999). Alcohol has a
deleterious effect on folate metabolism, affecting
individuals with the A1298C polymorphism most severely
(Ulrich, 1999). These individuals should be advised to avoid
alcohol.

Genes that code for enzymes responsible for metabolism of
alcohol

Specific examples of genes of category e for which
information relating to polymorphisms may be used in the
present invention include genes encoding aldehyde
dehydrogenase, e.g. the ALDH2, ALDH1 and ALDH3 genes.

Aldehyde dehydrogenase 2 (ALDH2) is involved in the second
step of ethanol utilisation. Reduced activity of this enzyme
leads to accumulation of acetaldehyde, a potent DNA adduct
former (Bosron, 1986). One polymorphism has been identified
to date, the ALDH2*2 polymorphism, which comprises
a G1156A substitution, and which has links with
oesophageal/throat, stomach, lung, and colon cancer
(IARC, 1998; Yokoyama, 1998). The advice to individuals with
the polymorphism would be to avoid alcohol. Polymorphisms in
ALDH1 and ALDH3 are associated with increased susceptibility
to cancers and Parkinson's disease.
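
The dietary and lifestyle advice given for categories b to e can
be viewed as a set of rules that map a genetic finding to advice.
The Python sketch below shows one possible way to hold such rules
in a lookup table. The keys (for example `GST_low_activity`) and
the function name `advice_for` are hypothetical labels chosen for
illustration; the advice strings paraphrase the text above.

```python
# Advice strings paraphrase the description; the keys are hypothetical labels.
ADVICE_RULES = {
    "GST_low_activity": [
        "avoid meat cooked at high temperatures",
        "increase fruit and vegetable consumption",
    ],
    "MnSOD_low_activity": [
        "take antioxidant supplements",
        "increase fruit and vegetable consumption",
    ],
    "MTHFR_low_activity": [
        "take folate supplements",
        "increase fruit and vegetable consumption",
    ],
    "MTHFR_A1298C": ["avoid alcohol"],
    "ALDH2*2": ["avoid alcohol"],
}


def advice_for(findings):
    """Collect advice for each finding, in first-seen order, without repeats."""
    advice = []
    for finding in findings:
        for item in ADVICE_RULES.get(finding, []):
            if item not in advice:
                advice.append(item)
    return advice
```

Findings that have no rule are simply skipped, so an individual
with both the ALDH2*2 and the MTHFR A1298C polymorphisms receives
the advice to avoid alcohol once only.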

Genes that encode enzymes involved in lipid and/or
cholesterol metabolism

Specific examples of genes of category f for which
information relating to polymorphisms may be used in the
present invention include genes encoding cholesteryl ester
transfer protein, e.g. the CETP gene, polymorphisms of which
genes are associated with altered susceptibility to coronary
artery disease (CAD) (Raknew, 2000; Ordovas, 2000);
apolipoprotein A-IV (ApoA-IV), polymorphisms of
which genes are associated with altered susceptibility to
coronary artery disease (CAD) (Wallace, 2000; Heilbronn,
2000); apolipoprotein E (ApoE), polymorphisms of which genes
are associated with altered susceptibility to CAD and
Alzheimer's disease (Corbo, 1999; Bullido, 2000); or
apolipoprotein C-III (ApoC-III), polymorphisms of which
genes are associated with altered susceptibility to CAD,
hypertension and insulin resistance (Salas, 1998).

Genes that encode enzymes involved in clotting mechanisms 

Specific examples of genes of category g for which
information relating to polymorphisms may be used in the
present invention include genes encoding angiotensin (AGT-1)
and angiotensin converting enzyme (ACE), polymorphisms of
which genes are associated with altered susceptibility to
hypertension (Brand, 2000; de Padua Mansur, 2000); factor VII,
polymorphisms of which genes are associated with altered
susceptibility to CAD (Donati, 2000; Di Castelnuovo, 2000);
prothrombin 20210, polymorphisms of which genes are
associated with altered susceptibility to venous thrombosis
(Vicente, 1999); β-fibrinogen, polymorphisms of which genes
are associated with altered susceptibility to CAD (Humphries,
1999); or heme-oxygenase-1, polymorphisms of which genes are
associated with altered susceptibility to emphysema (Yamada,
2000).

Genes that encode trypsin inhibitors 

Specific examples of genes of category h for which
information relating to polymorphisms may be used in the
present invention include genes encoding α-antitrypsin,
polymorphisms of which genes are associated with altered
susceptibility to chronic obstructive pulmonary disease
(COPD) (Miki, 1999); or serine protease inhibitor, Kazal type
1 (SPINK), polymorphisms of which genes are associated with
altered susceptibility to pancreatitis (Pfutzer, 2000).

Genes that encode enzymes related to susceptibility to metal
toxicity

Specific examples of genes of category i for which
information relating to polymorphisms may be used in the
present invention include genes encoding δ-aminolevulinic
acid dehydratase, polymorphisms of which genes are associated
with altered susceptibility to lead toxicity (Costa, 2000).

Genes which encode proteins required for normal cellular 
metabolism and growth 

Specific examples of genes of category j for which
information relating to polymorphisms may be used in the
present invention include genes encoding the vitamin D
receptor, polymorphisms of which genes are associated with
altered susceptibility to osteoporosis, tuberculosis, Graves'
disease, COPD, and early periodontal disease (Ban, 2000;
Wilkinson, 2000; Gelder, 2000; Miki, 1999; Hennig, 1999); the
B1 kinin receptor (B1R), polymorphisms of which genes are
associated with altered susceptibility to kidney disease
(Zychma, 1999); cystathionine-beta-synthase, polymorphisms of
which genes are associated with altered susceptibility to CAD
(Tsai, 1999); methionine synthase (B12 MS), polymorphisms of
which genes are associated with altered susceptibility to CAD
(Tsai, 1999); the 5-HT transporter, polymorphisms of which
genes are associated with altered susceptibility to
neurological disorders, Alzheimer's disease, schizophrenia
and other disorders of the serotonin pathway (Oliveira, 1999);
tumour necrosis factor receptor 2 (TNFR2), polymorphisms of
which genes are associated with altered susceptibility to CAD
(Fernandez-Real, 2000); the galactose metabolism gene GALT,
polymorphisms of which genes are associated with altered
susceptibility to ovarian cancer (Cramer, 2000); transforming
growth factor beta 1 (TGFβ1), polymorphisms of which genes
are associated with altered susceptibility to CAD and cancers
(Yokota, 2000); and L-myc, polymorphisms of which genes are
associated with altered susceptibility to CAD (especially in
relation to tolerance to smoking) and cancers (Togo, 2000).

Genes which encode proteins associated with immunological
susceptibility

Specific examples of genes of category k for which
information relating to polymorphisms may be used in the
present invention include genes encoding HLA Class 2
molecules, polymorphisms of which genes are associated with
altered susceptibility to cervical cancer and human papilloma
virus (HPV) infection (Maciag, 2000); T-lymphocyte associated
antigen 4 (CTLA-4), polymorphisms of which genes are
associated with altered susceptibility to liver disease
(Argawal, 2000); interleukin 1 (IL-1), polymorphisms of which
genes are associated with cardiovascular disease and
periodontal disease (macaiag, 2000; Nakajima, 2000); IL-4,
polymorphisms of which genes are associated with altered
susceptibility to atopy and asthma (Rosa-Rosa, 1999); IL-3,
polymorphisms of which genes are associated with altered
susceptibility to atopy and asthma (Rosa-Rosa, 1999); IL-6,
polymorphisms of which genes are associated with altered
susceptibility to osteoporosis; and IgA, polymorphisms of
which genes are associated with altered susceptibility to
COPD (Miki, 1999).

Detection of Polymorphisms

As described above, the method of the invention may include
the step of analysing a DNA sample of a human subject in
order to construct the dataset to be used in the method of
the invention.

Testing of Samples 
Collection of Tissue Samples 

DNA for analysis using the method or arrays of the invention
can be isolated from any suitable client or patient cell
sample. For convenience, it is preferred that the DNA is
isolated from cheek (buccal) cells. This enables easy and
painless collection of cells by the client, with the
convenience of being able to post the sample to the provider
of the genetic test without the problems associated with
posting a liquid sample.

Cells may be isolated from the inside of the mouth using a
disposable scraping device with a plastic or paper matrix
"brush", for example the C.E.P. Swab™ (Life Technologies
Ltd., UK). Cells are deposited onto the matrix upon gentle
abrasion of the inner cheek, resulting in the collection of
approximately 2000 cells (Aron, 1994). The paper brush can
then be left to dry completely, ejected from the handle,
placed into a microcentrifuge tube and posted by the client
or patient to the provider of the genetic test.

Isolation of DNA from Samples 

DNA from the cell samples can be isolated using conventional
procedures. For example, DNA may be immobilised onto filters,
column matrices, or magnetic beads. Numerous commercial kits,
such as the Qiagen QIAamp kit (Qiagen, Crawley, UK), may be
used. Briefly, the cell sample may be placed in a
microcentrifuge tube and combined with Proteinase K, mixed,
and allowed to incubate to lyse the cells. Ethanol is then
added and the lysate is transferred to a QIAamp spin column,
from which DNA is eluted after several washings.

The amount of DNA isolated by the particular method used may
be quantified to ensure that sufficient DNA is available for
the assay and to determine the dilution required to achieve
the desired concentration of DNA for PCR amplification. For
example, the desired target DNA concentration may be in the
range of 10 ng to 50 ng. DNA concentrations outside this
range may impact the PCR amplification of the individual
alleles and thus impact the sensitivity and selectivity of
the polymorphism determination step.
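
The dilution step can be illustrated with a short calculation. The
Python sketch below assumes, for illustration only, that the stock
concentration has been measured in ng per microlitre and that the
10 ng to 50 ng range refers to the amount of DNA added to each
reaction; neither assumption is stated in the description, and
the function name `stock_volume_ul` is hypothetical.

```python
def stock_volume_ul(stock_ng_per_ul: float, target_ng: float,
                    low_ng: float = 10.0, high_ng: float = 50.0) -> float:
    """Volume of stock DNA, in microlitres, that contains target_ng of DNA."""
    if stock_ng_per_ul <= 0:
        raise ValueError("stock concentration must be positive")
    if not low_ng <= target_ng <= high_ng:
        raise ValueError("target amount is outside the example range")
    # amount (ng) = concentration (ng/ul) x volume (ul)
    return target_ng / stock_ng_per_ul
```

For a stock measured at 25 ng per microlitre, 0.8 microlitres
would supply 20 ng of DNA.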

The quantity of DNA obtained from a sample may be determined
using any suitable technique. Such techniques are well known
to persons skilled in the art and include UV (Maniatis, 1982)
or fluorescence-based methods. As UV methods may suffer from
the interfering absorbance caused by contaminating molecules
such as nucleotides, RNA, EDTA and phenol, and the dynamic
range and sensitivity of this technique are not as great as
those of fluorescent methods, fluorescence methods are
preferred. Commercially available fluorescence-based kits,
such as the PicoGreen dsDNA Quantification kit (Molecular
Probes, Eugene, Oregon, USA), may be used.

Primers 

Prior to the testing of a sample, the nucleic acids in the
sample may be selectively amplified, for example using
Polymerase Chain Reaction (PCR) amplification, as described
in U.S. patent numbers 4,683,202 and 4,683,195.

Preferred primers for use in the present invention are from
18 to 23 nucleotides in length, without internal homology or
primer-primer homology.

Furthermore, to ensure amplification of the region of
interest and specificity, the two primers of a pair are
preferably selected to hybridise to either side of the region
of interest so that about 150 bases in length are amplified,
although amplification of shorter and longer fragments may
also be used. Ideally, the site of polymorphism should be at
or near the centre of the region amplified.
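
The stated preferences (primers of 18 to 23 nucleotides and a
polymorphic site at or near the centre of the amplified region)
lend themselves to a simple automated check. The Python sketch
below is one illustrative way to express them. The function name
`check_primer_design` and the default tolerance of 15 bases for
"near the centre" are assumptions made for this example, not
values taken from the description.

```python
def check_primer_design(forward: str, reverse: str,
                        amplicon_length: int, site_offset: int,
                        centre_tolerance: int = 15) -> list:
    """Return a list of warnings; an empty list means no warnings.

    site_offset is the 0-based position of the polymorphic site
    within the amplified region.
    """
    warnings = []
    for name, primer in (("forward", forward), ("reverse", reverse)):
        if not 18 <= len(primer) <= 23:
            warnings.append(
                f"{name} primer is {len(primer)} nt, outside 18-23 nt")
    centre = (amplicon_length - 1) / 2
    if abs(site_offset - centre) > centre_tolerance:
        warnings.append(
            "polymorphic site is not near the centre of the amplicon")
    return warnings
```

Because the description presents these values as preferences, the
sketch reports warnings and does not reject a design outright.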

Table 1 provides preferred examples of primer pairs which may
be used in the invention, particularly when the Taqman® assay
is used in the method of the invention. The primers are
shown together with the gene targets and preferred examples
of the wt probes and polymorphism probes used in the Taqman®
assay for each gene target.
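
In Table 1 each probe sequence contains one base written in lower
case, which appears to mark the position that differs between the
wt probe and the polymorphism probe. Under that reading, the two
bases detected by a probe pair can be recovered as in the Python
sketch below. The function names are illustrative assumptions.

```python
def interrogated_base(probe: str) -> str:
    """Return the single lower-case base of a probe, in upper case."""
    lower = [base for base in probe if base.islower()]
    if len(lower) != 1:
        raise ValueError(
            f"expected exactly one lower-case base in {probe!r}")
    return lower[0].upper()


def probe_pair_alleles(wt_probe: str, polymorphism_probe: str) -> tuple:
    """Return (wild type base, polymorphic base) read from a probe pair."""
    return (interrogated_base(wt_probe),
            interrogated_base(polymorphism_probe))
```

For the CYP1A1 A4889G probes CGGTGAGACCaTTG and CGGTGAGACCgTTG this
gives A and G, in agreement with the substitution label. A probe
with no lower-case base raises an error and is not guessed at.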

Table 2 provides preferred examples of the primer pairs which
may be used in the invention, together with the gene targets
and the size of the fragment which they amplify.

The primers and primer pairs form a further aspect of the
invention. Therefore the invention provides a primer having a
sequence selected from SEQ ID NO: 86-99, 104-163. In another
aspect, there is provided a primer pair comprising a primer
having SEQ ID NO: n, where n is an even number from 86-98 or
104-162, in conjunction with a primer having SEQ ID NO: (n+1).
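
The pairing rule (an even-numbered SEQ ID NO: n paired with SEQ ID
NO: n+1) can be enumerated directly. The short Python sketch below
lists the pairs covered by the stated ranges; the function name
`primer_pair_ids` is an assumption made for illustration.

```python
def primer_pair_ids():
    """List (n, n + 1) for every even n in 86-98 and in 104-162."""
    even_ids = list(range(86, 99, 2)) + list(range(104, 163, 2))
    return [(n, n + 1) for n in even_ids]
```

The list starts at (86, 87), omits (100, 101) and (102, 103), and
ends at (162, 163).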

In a preferred embodiment of the invention, multiplexed
amplification of a number of sequences is envisioned in
order to allow determination of the presence of a plurality
of polymorphisms using, for example, the DNA array method.
Therefore, primer pairs to be used in the same reaction are
preferably selected by position, similarity of melting
temperature, internal stability, absence of internal homology
or homology to each other (to prevent self-hybridisation or
hybridisation with other primers) and lack of propensity of
each primer to form a stable hairpin loop structure. Thus,
the sets of primer pairs to be coamplified
preferably have approximately the same thermal profile, so
that they can be effectively coamplified. This may
be achieved by having groups of primer pairs with
approximately the same length and the same G/C content.
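
Grouping primers by G/C content and approximate melting
temperature can be sketched in a few lines of Python. The estimate
used here is the common rule of thumb of 2 degrees C per A or T
and 4 degrees C per G or C (often called the Wallace rule); it is
substituted for illustration and is not stated in the description
as the method used. The function names and the default spread of 4
degrees are likewise assumptions.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C: 2 per A/T, 4 per G/C."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def compatible(primers, max_tm_spread: int = 4) -> bool:
    """True if all estimated melting temperatures lie within the spread."""
    tms = [wallace_tm(p) for p in primers]
    return max(tms) - min(tms) <= max_tm_spread
```

A rule of this kind is only a coarse screen for short
oligonucleotides; checks for hairpins and primer-primer homology
would still be needed before primers are combined in one reaction.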

Therefore in a further aspect of the invention, there is
provided a primer set comprising at least 5, more preferably
10 or 15, primer pairs selected from SEQ ID NO: 86-121.



Table 1 

20 



Gene 


Forward 
primer 


Reverse 
primer 


WT Probe 


Polymorphism 
probe 












1. CYP1A1 










A4 8 8 9G 


CATGGGCAAGCGGAAG 
TG 

(SEQ ID NO:122) 


CAGGATAGCCAGGAA 
GAGAAAGAC 
(SEQ ID NO: 123) 


CGGTGAGACCaTTG 
(SEQ ID NO:164) 


CGGTGAGACCgTTG 
(SEQ ID NO: 165) 


T6235C 


AGACAGGGTCCCCAGG 
TCAT 

(SEQ ID NO: 124) 


CAGAGGCTGAGGTGG 
GAGAA 

(SEQ ID NO: 125) 


CTCCACCTCCtGGG 
(SEQ ID NO:166) 


CTCCACCTCCcGGG 
(SEQ ID NO: 167) 












2. NATl 










G445A 


GGAGTTAATTTCTGGG 
AAGGATCAG 
(SEQ ID NO:126) 


TGGTCTAGATACCAG 
AATCCATTCTCTT 
(SEQ ID NO: 127) 


GCCTTGTgTCTTC 
(SEQ ID NO: 168) 


TGCCTTGTaTCTTC 
(SEQ ID NO: 169) 


G4 5 9A 


GGCAGCCTCTGGAGTT 
AATTTCT 

(SEQ ID NO:128) 


TTCCCTTCTGATTTG 
GTCTAGATACC 
(SEQ ID NO: 129) 


CGTTTGACgGAAGAG 
(SEQ ID NO: 170) 


CGTTTGACaGAAGAG 
(SEQ ID NO: 171) 


G560A 


GGGAACAGTACATT CC 
AAATGAAGA 
(SEQ ID NO: 130) 


TGTTCGAGGCTTAAG 
AGTAAAGGAGT 
(SEQ ID NO:131) 


AAT AC C g AAAAAT C 
(SEQ ID NO: 172) 


CAAATACCaAAAAAT 
(SEQ ID NO: 173) 


T640G 


AACAATTGAAGATTTT 
GAGTCTATGAATACA 
(SEQ ID NO: 132) 


TCTGCAAGGAACAAA 
ATGATTTACTAGT 
(SEQ ID NO:133) 


CAT CT CC At CAT CT G 
(SEQ ID N0:174) 


ACATCTCCAgCATCT 
(SEQ ID NO: 175) 


T1088A 


G AAAC AT AAC C AC AAA 
CCTTTTCAAA 
(SEQ ID NO:134) 


AAATCACCAATTTCC 
AAGATAACCA 
(SEQ ID NO: 135) 


CCATCTTTAAAATACA 
TTTaTTA 
(SEQ ID NO:203) 


CATCTTTAAAATACA 
TTTtTTA 

(SEQ ID NO:204) 
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Gene 


Forward 

primer 


Reverse 
primer 


WT Probe 


Polymorphism 
probe 












C1095A | AAACATAACCACAAACCTTTTCAAATAAT (SEQ ID NO: 136) | AAATCACCAATTTCCAAGATAACCA (SEQ ID NO: 137) | GCCATCTTTAAAAgACAT (SEQ ID NO: 176) | GCCATCTTTAAAAtACATT (SEQ ID NO: 177)

3. NAT2

 | AATCAACTTCTGTACTGGGCTCTGA (SEQ ID NO: 138) | CCATGCCAGTGCTGTATTTGTT (SEQ ID NO: 139) | AGGGTATTTTTAcATCCCT (SEQ ID NO: 178) | AGGGTATTTTTAtATCCCTC (SEQ ID NO: 179)
 | TGCATTTTCTGCTTGACAGAAGA (SEQ ID NO: 140) | TTTGTTTGTAATATACTGCTCTCTCCTGAT (SEQ ID NO: 141) | TCTGGTACCTGGACCAA (SEQ ID NO: 180) | AATCTGGTACtTGGACCAA (SEQ ID NO: 181)
G>A | GCCAAAGAAGAAACACCAAAAAAT (SEQ ID NO: 142) | AAATGATGTGGTTATAAATGAAGATGTTG (SEQ ID NO: 143) | TGAACCTCgAACAAT (SEQ ID NO: 182) | TTGAACCTCaAACAA (SEQ ID NO: 183)
G>A2 | AAGAGGTTGAAGAAGTGCTGAAAAATAT (SEQ ID NO: 144) | ATACATACACAAGGGTTTATTTTGTTCCT (SEQ ID NO: 145) | CTGGTGATGgATCC (SEQ ID NO: 184) | CTGGTGATGaATCC (SEQ ID NO: 185)

4. GSTM1

C534G | GTTCCAGCCCACACATTCTTG (SEQ ID NO: 146) | CGGGAGATGAAGTCCTTCAGATT (SEQ ID NO: 147) | CAAGCAgTTGGGC (SEQ ID NO: 186) | CAAGCAcTTGGGC (SEQ ID NO: 187)

5. GSTP1

A313G | CCTGGTGGACATGGTGAATG (SEQ ID NO: 148) | GCAGATGCTCACATAGTTGGTGTAG (SEQ ID NO: 149) | GCAAATACaTCTCCCT (SEQ ID NO: 188) | GCAAATACgTCTCCCT (SEQ ID NO: 189)
C341T | GGGATGAGAGTAGGATGATACATGGT (SEQ ID NO: 150) | GGGTCTCAAAAGGCTTCAGTTG (SEQ ID NO: 151) | CCTTGCCCgCCTC (SEQ ID NO: 190) | CTTGCCCaCCTCC

6. GSTT1

 | TCATTCTGAAGGCCAAGGACTT (SEQ ID NO: 152) | CAGGGCATCAGCTTCTGCTT (SEQ ID NO: 153) | CCTGCAGACCCC (SEQ ID NO: 192) | N/A

7. MnSOD

T-28C | GGCTGTGCTTTCTCGTCTTCA (SEQ ID NO: 154) | TTCTGCCTGGAGCCCAGAT (SEQ ID NO: 155) | ACCCCAAAaCCGGA (SEQ ID NO: 193) | ACCCCAAAgCCGGA (SEQ ID NO: 194)
T175C | GTGTTGCATTTACTTCAGGAGATGTT (SEQ ID NO: 156) | TCCAGAAAATGCTATGATTGATATGAC (SEQ ID NO: 157) | AGCCCAGAtAGCT (SEQ ID NO: 195) | AGCCCAGAcAGCT (SEQ ID NO: 196)

8. MTHFR

C677T | GACCTGAAGCACTTGAAGGAGAA (SEQ ID NO: 158) | TCAAAGAAAAGCTGCGTGATGA (SEQ ID NO: 159) | AAATCGgCTCCCGC (SEQ ID NO: 197) | AAATCGaCTCCCGCAGA (SEQ ID NO: 198)
A1298C | AAGAGCAAGTCCCCCA (SEQ ID NO: 160) | CTTTGTGACCATTCC (SEQ ID NO: 161) | CAGTGAAGaAAGTGTC | AGTGAAGcAAGTGTC (SEQ ID NO: 200)

9. ALDH2

G1156A | CCCTTTGGTGGCTACAAGATGT (SEQ ID NO: 162) | AGACCCTCAAGCCCCAACA (SEQ ID NO: 163) | TCACAGTTTTCACTTcAGTGT (SEQ ID NO: 201) | TCACAGTTTTCACTTtAGTGT (SEQ ID NO: 202)

Table 2: Examples of Primer pairs



Gene | Primer set | Forward | Reverse | Size

NAT1 | 1 | N/A same genotype as set 3 | |
NAT1 | 2 | N/A same genotype as set 3 | |
NAT1 | 3 | 5' ggg ttt gga cgc tca tac c (SEQ ID NO: 86) | 5' aat gta ctg ttc cct tct gat ttg g (SEQ ID NO: 87) | 141bp
NAT1 | 4b | 5' tcc gtt tga cgg aag aga at (SEQ ID NO: 88) | 5' ggg tct gca agg aac aaa at (SEQ ID NO: 89) | 234bp
NAT1 | 5 | 5' gaa aca taa cca caa acc (SEQ ID NO: 90) | 5' caa caa taa acc aac att aaa agc (SEQ ID NO: 91) | 241bp

NAT2 | 1 | 5' act tct gta ctg ggc tct gac c (SEQ ID NO: 92) | 5' gca tcg aca atg taa ttc ctg c (SEQ ID NO: 93) | 150bp
NAT2 | 2 | 5' aat aca gca ctg gca tgg (SEQ ID NO: 94) | 5' caa gga aca aaa tga tgt gg (SEQ ID NO: 95) | 380bp
NAT2 | 3 | 5' gtg ggc ttc atc ctc acc ta (SEQ ID NO: 96) | 5' ggg tga tac ata cac aag ggt tt (SEQ ID NO: 97) | 209bp

GSTM1 | 1 | 5' cag ccc aca cat tct tgg (SEQ ID NO: 98) | 5' aag cgg gag atg aag tcc (SEQ ID NO: 99) | 196bp

MTHFR | 1 | 5' agg tta ccc caa agg cca cc (SEQ ID NO: 100) | 5' gca agt gat gcc cat gtc g (SEQ ID NO: 101) | 166bp
MTHFR | 2 | 5' tct tct acc tga aga gca agt cc (SEQ ID NO: 102) | 5' caa gtc act ttg tga cca ttc c (SEQ ID NO: 103) | 142bp

CYP1A1 | 1b | 5' cct gaa ctg cca ctt cag c (SEQ ID NO: 104) | 5' cca gga aga gaa aga cct cc (SEQ ID NO: 105) | 199bp
CYP1A1 | 2 | 5' ccc att ctg tgt ttg ggt ttt t (SEQ ID NO: 106) | 5' aga ggc tga ggt ggg aga at (SEQ ID NO: 107) | 213bp

GSTT1 | 1 | 5' gag gtc att ctg aag gcc aag g (SEQ ID NO: | 5' ttt gtg gac tgc tga gga cg (SEQ ID NO: 109) | 133bp

β-actin | 1b | 5' tcc tca gat cat tgc tcc (SEQ ID NO: 110) | 5' taa cgc aac taa gtc ata gtc c (SEQ ID NO: 111) |

MnSOD | 1 | 5' ggc tgt gct ttc tcg tct tc (SEQ ID NO: 112) | 5' ggt gac gtt cag gtt gtt ca (SEQ ID NO: 113) | 194bp
MnSOD | 2 | 5' aca gtg gtt gaa aaa gta gg (SEQ ID NO: 114) | 5' caa aat gta gat aag ggt gc (SEQ ID NO: 115) | 205bp

ALDH2 | 1 | 5' ttg gtg gct aca aga tgt cg (SEQ ID NO: 116) | 5' agg tcc tga act tcc agc ag (SEQ ID NO: 117) |

GSTP1 | 1 | 5' gct cta tgg gaa gga cca gc (SEQ ID NO: 118) | 5' aag cca cct gag ggg taa gg (SEQ ID NO: 119) | 192bp
GSTP1 | 2 | 5' cag cag ggt ctc aaa agg (SEQ ID NO: 120) | 5' gat gga cag gca gaa tgg (SEQ ID NO: 121) | 250bp



Having obtained a sample of DNA, preferably with amplified
regions of interest, individual polymorphisms may be
identified. Identification of the markers for the
polymorphisms involves the discriminative detection of
allelic forms of the same gene that differ by nucleotide
substitution or, in the case of some genes, for example the
GSTM1 and GSTT1 genes, by deletion of the entire gene. Methods
for the detection of known nucleotide differences are well
known to the skilled person. These may include, but are not
limited to:

a. Hybridization with allele-specific oligonucleotides
(ASO) (Wallace, 1981; Ikuta, 1987; Nickerson, 1990;
Varlaan, 1986; Saiki, 1989 and Zhang, 1991).

b. Allele-specific PCR (Newton, 1989; Gibbs, 1989).

c. Solid-phase minisequencing (Syvanen, 1993).

d. Oligonucleotide ligation assay (OLA) (Wu, 1989;
Barany, 1991; Abravaya, 1995).

e. The 5' fluorogenic nuclease assay (Holland, 1991 &
1992; Lee, 1998; US patents 4,683,202, 4,683,195,
5,723,591 and 5,801,155).

f. Restriction fragment length polymorphism (RFLP)
(Donis-Keller, 1987).

In a preferred embodiment, the genetic loci are assessed via
a specialised type of PCR used to detect polymorphisms,
commonly referred to as the Taqman® assay and performed using
an AB7700 instrument (Applied Biosystems, Warrington, UK).
In this method, a probe is synthesised which hybridises to a
region of interest containing the polymorphism. The probe
contains three modifications: a fluorescent reporter
molecule, a fluorescent quencher molecule and a minor groove
binding chemical to enhance binding to the genomic DNA
strand. The probe may be bound to either strand of DNA. For
example, in the case of binding to the coding strand, when
the Taq polymerase enzyme begins to synthesise DNA from the
5' upstream primer, the polymerase will encounter the probe
and begin to remove bases from the probe one at a time using
its 5'-3' exonuclease activity. When the base bound to the
fluorescent reporter molecule is removed, the fluorescent
molecule is no longer quenched by the quencher molecule and
the molecule will begin to fluoresce. This type of reaction
can only take place if the probe has hybridised perfectly to
the matched genomic sequence. As successive cycles of
amplification take place, i.e. more probes and primers are
bound to the DNA present in the reaction mixture, the amount
of fluorescence will increase and a positive result will be
detected. If the genomic DNA does not have a sequence that
matches the probe perfectly, no fluorescent signal is
detected.
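
The perfect-match requirement described above can be illustrated with a short sketch. It is a toy model only: hybridisation is treated as exact string matching on either strand, which ignores melting temperature and partial binding. The function names are hypothetical. The template is an assumed 25 nt wild-type stretch around NAT2 C282T, based on the wt-lead entry of Table 3, and the two probes are SEQ ID NO: 178 and 179.

```python
# Toy model (assumption): a probe "binds" only on an exact match to either strand.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, lower-cased."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def probe_hybridises(template: str, probe: str) -> bool:
    """True if the probe matches the template exactly on either strand."""
    template = template.lower()
    probe = probe.lower()
    return probe in template or reverse_complement(probe) in template


# Assumed wild-type template around NAT2 C282T (illustrative input).
WT_TEMPLATE = "agggtatttttacatccctccagtt"
WT_PROBE = "AGGGTATTTTTAcATCCCT"    # SEQ ID NO: 178
VAR_PROBE = "AGGGTATTTTTAtATCCCTC"  # SEQ ID NO: 179

if __name__ == "__main__":
    print("wt probe signal:", probe_hybridises(WT_TEMPLATE, WT_PROBE))
    print("polymorph probe signal:", probe_hybridises(WT_TEMPLATE, VAR_PROBE))
```

In this model only the wild-type probe gives a signal on the wild-type template, mirroring the statement that no fluorescence is detected without a perfect match.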

Examples of oligonucleotide probes which may be used in the
invention, particularly when the Taqman® assay is used in the
method of the invention, are listed above together with
primers which may be used. These oligonucleotide probes form
another aspect of the present invention.

Therefore in a further aspect of the invention, there is
provided an oligonucleotide having a sequence selected from
SEQ ID NO: 164-202. The invention further provides a set of
oligonucleotides comprising at least 5, 10, 20, 30, 40, 50,
60 or 70 oligonucleotides selected from the group comprising
SEQ ID NO: 164-202.

Arrays

In a preferred embodiment of the invention, hybridisation 
with allele specific oligonucleotides is conveniently carried 
out using oligonucleotide arrays, preferably microarrays, to 
determine the presence of particular polymorphisms. 


Such microarrays allow miniaturisation of assays, e.g. making
use of binding agents (such as nucleic acid sequences)
immobilised in small, discrete locations (microspots) and/or
as arrays on solid supports or on diagnostic chips. These
approaches can be particularly valuable as they can provide
great sensitivity (particularly through the use of
fluorescently labelled reagents), require only very small
amounts of biological sample from individuals being tested
and allow a variety of separate assays to be carried out
simultaneously. This latter advantage can be useful as it
allows assays for a number of different polymorphisms of
one or more genes to be carried out using a single sample.
Examples of techniques enabling this miniaturised technology
are provided in WO84/01031, WO88/1058, WO89/01157, WO93/8472,
WO95/18376, WO95/18377, WO95/24649 and EP-A-0373203, the
subject matter of which is herein incorporated by reference.

DNA microarrays have been shown to provide appropriate
discrimination for polymorphism detection. Yershov, 1996;
Cheung, 1999 and Schena, 1999 have described the principles of
the technique. In brief, the DNA microarray may be generated
using oligonucleotides that have been selected to hybridise
with the specific target polymorphism. These oligonucleotides
may be applied by a robot onto a predetermined location of a
glass slide, e.g. at predetermined X, Y cartesian coordinates,
and immobilised. The PCR product (e.g. fluorescently labelled
RNA or DNA) is introduced on to the DNA microarray and a
hybridisation reaction conducted so that sample RNA or DNA
binds to complementary sequences of oligonucleotides in a
sequence-specific manner, allowing unbound material to be
washed away. Gene target polymorphisms can thus be detected
by their ability to bind to complementary oligonucleotides on
the array and produce a signal. The absence of a fluorescent
signal for a specific oligonucleotide probe indicates that
the client does not have the corresponding polymorphism. Of
course, the method is not limited to the use of fluorescence
labelling but may use other suitable labels known in the art.
The fluorescence at each coordinate can be read using a
suitable automated detector in order to correlate each
fluorescence signal with a particular oligonucleotide.
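
The correlation of fluorescence readings with array coordinates can be sketched as follows. This is a minimal illustration under stated assumptions: the spot layout, the intensity values and the detection threshold are all hypothetical, and a real detector would also need background correction and controls.

```python
from typing import Dict, Set, Tuple

Coordinate = Tuple[int, int]  # (X, Y) position of a spot on the slide


def detected_probes(
    layout: Dict[Coordinate, str],
    intensities: Dict[Coordinate, float],
    threshold: float,
) -> Set[str]:
    """Return the names of probes whose spot fluorescence reaches the threshold."""
    return {
        name
        for xy, name in layout.items()
        if intensities.get(xy, 0.0) >= threshold
    }


# Hypothetical layout and readings, for illustration only.
LAYOUT = {
    (0, 0): "MTHFR C677T wt-lead",
    (0, 1): "MTHFR C677T polymorph-lead",
}
READINGS = {(0, 0): 5200.0, (0, 1): 130.0}

if __name__ == "__main__":
    # With an assumed threshold of 1000 units, only the wt-lead spot is called.
    print(detected_probes(LAYOUT, READINGS, threshold=1000.0))
```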

Oligonucleotides for use in the array may be selected to span
the site of the polymorphism, each oligonucleotide comprising
one of the following at a central location within the
sequence:

a. wild-type or normal base at the position of interest in 
the leading strand 

b. wild-type or normal base at the position of interest in 
the lag (non-coding) strand 

c. altered base at the position of interest in the leading 
strand 

d. altered complementary base at the position of interest 
in the lag strand 
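
The four oligonucleotide types a to d can be derived from one another, as the following sketch shows. The function and key names are hypothetical; the flanking sequence used in the example is an assumed 12 nt on each side of the CYP1A1 A4889G site, in the style of the 25 nt entries of Table 3, with the central base written in upper case.

```python
# Case-preserving complement table, so the central base stays upper case.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, keeping upper/lower case per base."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_set(upstream: str, wt_base: str, variant_base: str, downstream: str) -> dict:
    """Build wt/polymorph probes for the leading and lag strands."""

    def lead(base: str) -> str:
        return upstream.lower() + base.upper() + downstream.lower()

    wt_lead = lead(wt_base)
    variant_lead = lead(variant_base)
    return {
        "wt-lead": wt_lead,
        "wt-lag": reverse_complement(wt_lead),
        "polymorph-lead": variant_lead,
        "polymorph-lag": reverse_complement(variant_lead),
    }


if __name__ == "__main__":
    # Assumed flanks around CYP1A1 A4889G (A in wild type, G in the polymorph).
    for name, seq in probe_set("atcggtgagacc", "a", "g", "ttgcccgctggg").items():
        print(f"{name}: 5' {seq}")
```

Because the flanks are of equal length, the polymorphic base stays at the central position of the lag-strand probes after reverse complementation.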

The arrays used in the present method form another 
independent aspect of the present invention. Arrays of the 
invention comprise a set of two or more oligonucleotides, 
each oligonucleotide being specific to a sequence comprising 
one or more polymorphisms of a gene selected from the group 
comprising categories a-k as defined above. 

Preferably, the array will comprise oligonucleotides each
being specific to a sequence comprising one or more
polymorphisms of an individual gene of at least two different
categories a-k as defined above, for example a+b (i.e. at
least one oligonucleotide specific for a sequence comprising
one or more polymorphisms of a first gene, the first gene
being of category a, and at least one oligonucleotide specific
for a sequence comprising one or more polymorphisms of a
second gene, the second gene being of category b), a+c, a+d,
a+e, a+f, a+g, a+h, a+i, a+j, a+k, b+c, b+d, b+e etc., c+d,
c+e etc., d+e, d+f etc., e+f, e+g etc., f+g, f+h etc., g+h,
g+i, g+k, h+i, h+k. Where the array comprises two or more
oligonucleotides, it is preferred that at least one of the
oligonucleotides is an oligonucleotide specific for a
sequence of a polymorphism of a gene of category d, due to
the central role of micronutrients in the maintenance of
proper cellular growth and DNA repair, and due to the
association of micronutrient metabolism or utilisation
disorders with several different types of diseases (Ames,
1999; Perera, 2000; Potter, 2000).

More preferably, the array will comprise oligonucleotides
each being specific to a sequence comprising one or more
polymorphisms of an individual gene of at least three
different categories a-k as defined above, for example,
a+b+c, a+b+d, a+b+e, a+b+f, a+b+g, a+b+h, a+b+i, a+b+j,
a+b+k, a+c+d, a+c+e etc., a+d+e etc., b+c+d etc., c+d+e etc.,
d+e+f etc., and all other combinations of three categories.
Where the array comprises three or more oligonucleotides, it
is preferred that at least two of the oligonucleotides are
oligonucleotides specific for a sequence of a polymorphism of
a gene of categories d and e. Information relating to
polymorphisms present in both of these categories is
particularly useful due to the effects of alcohol consumption
and metabolism on the efficiency of enzymes related to
micronutrient metabolism and utilisation (Ulrich, 1999). In a
further preferred embodiment where the array comprises three
or more oligonucleotides, it is preferred that at least two
of the oligonucleotides are oligonucleotides specific for a
sequence of a polymorphism of a gene of categories a and b,
due to the close interaction of Phase I and Phase II enzymes
in the metabolism of xenobiotics.

Even more preferably, the array will comprise
oligonucleotides each being specific to a sequence comprising
one or more polymorphisms of an individual gene of at least
four different categories a-k as defined above, for example,
a+b+c+d, a+b+c+e, a+b+d+e, a+c+d+e, b+c+d+e etc. Where the
array comprises four or more oligonucleotides, it is
preferred that at least three of the oligonucleotides are
oligonucleotides specific for a sequence of a polymorphism of
a gene of categories d, e and f. Information relating to
polymorphisms present in these three categories is
particularly useful due to the strong correlation of
polymorphisms of these alleles with coronary artery disease
due to the combined effects of altered micronutrient
utilisation, affected adversely by alcohol metabolism,
together with imbalances in fat and cholesterol metabolism.
Where the array comprises five or more oligonucleotides, it
is preferred that at least four of the oligonucleotides are
oligonucleotides specific for a sequence of a polymorphism of
a gene of categories a, b, d and e. Information relating to
polymorphisms present in these four categories is
particularly useful due to the combined effects of
micronutrient utilisation, alcohol metabolism, Phase I
metabolism of xenobiotics and Phase II metabolism on the
further metabolism and excretion of potentially harmful
metabolites produced in the body (Taningher, 1999; Ulrich,
1999). Similarly, the array may comprise oligonucleotides
each being specific to a sequence comprising one or more
polymorphisms of an individual gene of at least five, for
example a, b, d, e and f, six, seven, eight, nine or ten
different categories a-k as defined above.
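
The category combinations referred to above ("a+b", "a+b+c" and so on, with certain categories preferred) can be enumerated mechanically. The sketch below is illustrative only; the helper name `category_sets` is hypothetical, and the letters a to k simply stand for the eleven gene categories defined earlier in the specification.

```python
from itertools import combinations
from typing import List

CATEGORIES = "abcdefghijk"  # the eleven categories a-k


def category_sets(size: int, required: str = "") -> List[str]:
    """List all combinations of `size` categories that include every required one."""
    needed = set(required)
    return [
        "+".join(combo)
        for combo in combinations(CATEGORIES, size)
        if needed <= set(combo)
    ]


if __name__ == "__main__":
    print(len(category_sets(2)), "pairs in total")
    print(len(category_sets(3)), "triples in total")
    # Triples containing the preferred categories d and e.
    print(category_sets(3, required="de"))
```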

Most preferably, the array will comprise oligonucleotides 
each being specific to a sequence comprising one or more 
polymorphisms of an individual gene of each of categories a-k 
as defined above. 

In one preferred embodiment, the array comprises
oligonucleotides each being specific to a sequence comprising
one or more polymorphisms of individual genes, the individual
genes comprising each member of the group comprising genes
encoding cytochrome P450 monooxygenase, N-acetyltransferase
1, N-acetyltransferase 2, glutathione-S-transferase,
manganese superoxide dismutase,
5,10-methylenetetrahydrofolate reductase and alcohol
dehydrogenase 2 enzymes, genetic loci of genes encoding each
of the cytochrome P450 monooxygenase, N-acetyltransferase 1,
N-acetyltransferase 2, glutathione-S-transferase, manganese
superoxide dismutase, 5,10-methylenetetrahydrofolate
reductase and alcohol dehydrogenase 2 enzymes. In a more
preferred embodiment the array further comprises
oligonucleotides specific for one or more alleles of the
genetic loci of genes encoding one or more, preferably each,
of epoxide hydrolase (EH), NADPH-quinone reductase (NQO1),
paraoxonase (PON1), myeloperoxidase (MPO), alcohol
dehydrogenase 1, alcohol dehydrogenase 3, cholesteryl ester
transfer protein, apolipoprotein A IV, apolipoprotein E,
apolipoprotein C III, angiotensin, factor VII, prothrombin
20210, β-fibrinogen, heme-oxygenase-1, α-antitrypsin,
SPINK1, δ-aminolevulinic acid dehydratase, interleukin 1,
interleukin 1, vitamin D receptor, B1 kinin receptor,
cystathionine-beta-synthase, methionine synthase (B12 MS),
5-HT transporter, transforming growth factor beta 1 (TGFβ1),
L-myc, HLA Class 2 molecules, T-lymphocyte associated antigen
4 (CTLA-4), interleukin 4, interleukin 3, interleukin 6, IgA,
and/or galactose metabolism gene GALT.

In preferred arrays, the oligonucleotides in the array
comprise at least 5, 10, 20, 30, 40, 50, 60 or 70
oligonucleotides selected from the group comprising SEQ ID
NO: 1 to SEQ ID NO: 85 illustrated in Table 3, which shows
preferred oligonucleotides listed in the right column with
the primer set used to amplify the appropriate fragments of
sample DNA listed in the left column.

In a preferred embodiment the array will comprise all of the
oligonucleotides SEQ ID NO: 1 to 85.
Table 3



Gene Target | 25 nt sequence

1. CYP1A1

Primer set1 A4889G wt-lead | 5' atc ggt gag acc Att gcc cgc tgg g (SEQ ID NO: 1)
Primer set1 A4889G wt-lag | 5' ccc agc ggg caa Tgg tct cac cga t (SEQ ID NO: 2)
Primer set1 A4889G polymorph-lead | 5' atc ggt gag acc Gtt gcc cgc tgg g (SEQ ID NO: 3)
Primer set1 A4889G polymorph-lag | 5' ccc agc ggg caa Cgg tct cac cga t (SEQ ID NO: 4)

Primer set2 T6235C wt-lead | 5' acc tcc acc tcc Tgg gct cac acg a (SEQ ID NO: 5)
Primer set2 T6235C wt-lag | 5' tcg tgt gag ccc Agg agg tgg agg t (SEQ ID NO: 6)
Primer set2 T6235C polymorph-lead | 5' acc tcc acc tcc Cgg gct cac acg a (SEQ ID NO: 7)
Primer set2 T6235C polymorph-lag | 5' tcg tgt gag ccc Ggg agg tgg agg t (SEQ ID NO: 8)

2. NAT1

Primer set1 | N/A
Primer set2 | N/A
Primer set3 G445A wt-lead | 5' cag gtg cct tgt Gtc ttc cgt ttg a (SEQ ID NO: 9)
Primer set3 G445A wt-lag | 5' tca aac gga aga Cac aag gca cct g (SEQ ID NO: 10)
Primer set3 G445A polymorph-lead | 5' cag gtg cct tgt Atc ttc cgt ttg a (SEQ ID NO: 11)
Primer set3 G445A polymorph-lag | 5' tca aac gga aga Tac aag gca cct g (SEQ ID NO: 12)

Primer set3 G459A wt-lead | 5' ctt ccg ttt gac Gga aga gaa tgg a (SEQ ID NO: 13)
Primer set3 G459A wt-lag | 5' tcc att ctc ttc Cgt caa acg gaa g (SEQ ID NO: 14)
Primer set3 G459A polymorph-lead | 5' ctt ccg ttt gac Aga aga gaa tgg a (SEQ ID NO: 15)
Primer set3 G459A polymorph-lag | 5' tcc att ctc ttc Tgt caa acg gaa g (SEQ ID NO: 16)

Primer set4 G560A wt-lead | 5' aca gca aat acc Gaa aaa tct act c (SEQ ID NO: 17)
Primer set4 G560A wt-lag | 5' gag tag att ttt Cgg tat ttg ctg t (SEQ ID NO: 18)
Primer set4 G560A polymorph-lead | 5' aca gca aat acc Aaa aaa tct act c (SEQ ID NO: 19)
Primer set4 G560A polymorph-lag | 5' gag tag att ttt Tec tat ttg ctg t (SEQ ID NO: 20)

Primer set5 T1088A wt-lead*a | 5' taa taa taa taa Taa atg tct ttt a (SEQ ID NO: 21)
Primer set5 T1088A wt-lag*a | 5' taa aag aca ttt Att att art att a (SEQ ID NO: 22)
Primer set5 T1088A wt-lead*b | 5' taa taa taa taa Taa atg tat ttt a (SEQ ID NO: 23)
Primer set5 T1088A wt-lag*b | 5' taa aat aca ttt Att att tta att a (SEQ ID NO: 24)
Primer set5 T1088A polymorph-lead*a | 5' taa taa taa taa Aaa atg tct ttt a (SEQ ID NO: 25)
Primer set5 T1088A polymorph-lag*a | 5' taa aag aca ttt Ttt att tta att a (SEQ ID NO: 26)
Primer set5 T1088A polymorph-lead*b | 5' taa taa taa taa Aaa atg tat ttt a
Primer set5 T1088A polymorph-lag*b | 5' taa aat aca ttt Ttt att tta att a (SEQ ID NO: 27)
* redundancy due to adjacent polymorphisms

Primer set5 C1095A wt-lead*a | 5' aat aat aaa tgt Ctt tta aag atg g (SEQ ID NO: 28)
Primer set5 C1095A wt-lag*a | 5' cca tct tta aaa Gac att tat tat t (SEQ ID NO: 29)
Primer set5 C1095A wt-lead*b | 5' aat aaa aaa tgt Ctt tta aag atg g (SEQ ID NO: 30)
Primer set5 C1095A wt-lag*b | 5' cca tct tta aaa Gac att ttt tat t (SEQ ID NO: 31)
Primer set5 C1095A polymorph-lead*a | 5' aat aat aaa tgt Att tta aag atg g (SEQ ID NO: 32)
Primer set5 C1095A polymorph-lag*a | 5' cca tct tta aaa Tac att tat tat t (SEQ ID NO: 33)
Primer set5 C1095A polymorph-lead*b | 5' aat aaa aaa tgt Att tta aag atg g (SEQ ID NO: 34)
Primer set5 C1095A polymorph-lag*b | 5' cca tct tta aaa Tac att ttt tat t (SEQ ID NO: 35)
* redundancy due to adjacent polymorphisms

3. NAT2

Primer set1 C282T wt-lead | 5' agg gta ttt tta Cat ccc tcc agt t (SEQ ID NO: 36)
Primer set1 C282T wt-lag | 5' aac tgg agg gat Gta aaa ata ccc t (SEQ ID NO: 37)
Primer set1 C282T polymorph-lead | 5' agg gta ttt tta Tat ccc tcc agt t (SEQ ID NO: 38)
Primer set1 C282T polymorph-lag | 5' aac tgg agg gat Ata aaa ata ccc t (SEQ ID NO: 39)

Primer set2 C481T wt-lead | 5' gga atc tgg tac Ctg gac caa atc a (SEQ ID NO: 40)
Primer set2 C481T wt-lag | 5' tga ttt ggt cca Ggt acc aga ttc c (SEQ ID NO: 41)
Primer set2 C481T polymorph-lead | 5' gga atc tgg tac Ttg gac caa atc a (SEQ ID NO: 42)
Primer set2 C481T polymorph-lag | 5' tga ttt ggt cca Agt acc aga ttc c (SEQ ID NO: 43)

Primer set2 G590A wt-lead | 5' cgc ttg aac ctc Gaa caa ttg aag a (SEQ ID NO: 44)
Primer set2 G590A wt-lag | 5' tct tca att gtt Cga ggt tca agc g (SEQ ID NO: 45)
Primer set2 G590A polymorph-lead | 5' cgc ttg aac ctc Aaa caa ttg aag a (SEQ ID NO: 46)
Primer set2 G590A polymorph-lag | 5' tct tca att gtt Tga ggt tca agc g (SEQ ID NO: 47)

Primer set3 G857A wt-lead | 5' aac ctg gtg atg Gat ccc tta cta t (SEQ ID NO: 48)
Primer set3 G857A wt-lag | 5' ata gta agg gat Cca tca cca ggt t (SEQ ID NO: 49)
Primer set3 G857A polymorph-lead | 5' aac ctg gtg atg Aat ccc tta cta t (SEQ ID NO: 50)
Primer set3 G857A polymorph-lag | 5' ata gta agg gat Tca tca cca ggt t (SEQ ID NO: 51)

4. GSTM1

Primer set1 wt-lead | 5' gct aca ttg ccc gca agc aca acc t (SEQ ID NO: 52)
Primer set1 wt-lag | 5' agg ttg tgc ttg cgg gca atg tag c (SEQ ID NO: 53)

5. GSTP1

Primer set1 A313G wt-lead | 5' cgc tgc aaa tac Atc tcc ctc atc t (SEQ ID NO: 54)
Primer set1 A313G wt-lag | 5' aga tga ggg aga Tgt att tgc agc g (SEQ ID NO: 55)
Primer set1 A313G polymorph-lead | 5' cgc tgc aaa tac Gtc tcc ctc atc t (SEQ ID NO: 56)
Primer set1 A313G polymorph-lag | 5' aga tga ggg aga Cgt att tgc agc g (SEQ ID NO: 57)
Primer set2 C341T wt-lead | 5' tct ggc agg agg Cgg gca agg atg a (SEQ ID NO: 58)
Primer set2 C341T wt-lag | 5' tca tcc ttg ccc Gcc tcc tgc cag a (SEQ ID NO: 59)
Primer set2 C341T polymorph-lead | 5' tct ggc agg agg Tgg gca agg atg a (SEQ ID NO: 60)
Primer set2 C341T polymorph-lag | 5' tca tcc ttg ccc Acc tcc tgc cag a (SEQ ID NO: 61)

6. GSTT1

Primer set1 wt-lead | 5' acc ata aag cag aag ctg atg ccc t (SEQ ID NO: 62)
Primer set2 wt-lag | 5' agg gca tca gct tct gct tta tgg t (SEQ ID NO: 63)

7. MnSOD

Primer set1 T-26C wt-lead | 5' agc tgg ctc cgg Ttt tgg ggt atc t (SEQ ID NO: 64)
Primer set1 T-26C wt-lag | 5' aga tac ccc aaa Acc gga gcc agc t (SEQ ID NO: 65)
Primer set1 T-26C polymorph-lead | 5' agc tgg ctc cgg Ctt tgg ggt atc t (SEQ ID NO: 66)
Primer set1 T-26C polymorph-lag | 5' aga tac ccc aaa Gcc gga gcc agc t (SEQ ID NO: 67)

Primer set2 T175C wt-lead | 5' tta cag ccc aga Tag ctc ttc agc c (SEQ ID NO: 68)
Primer set2 T175C wt-lag | 5' ggc tga aga gct Atc tgg gct gta a (SEQ ID NO: 69)
Primer set2 T175C polymorph-lead | 5' tta cag ccc aga Cag ctc ttc agc c (SEQ ID NO: 70)
Primer set2 T175C polymorph-lag | 5' ggc tga aga gct Gtc tgg gct gta a (SEQ ID NO: 71)

8. MTHFR

Primer set1 C677T wt-lead | 5' tgt ctg cgg gag Ccg att tca tca t (SEQ ID NO: 72)
Primer set1 C677T wt-lag | 5' atg atg aaa tcg Gct ccc gca gac a (SEQ ID NO: 73)
Primer set1 C677T polymorph-lead | 5' tgt ctg cgg gag Tcg att tca tca t (SEQ ID NO: 74)
Primer set1 C677T polymorph-lag | 5' atg atg aaa tcg Act ccc gca gac a (SEQ ID NO: 75)

Primer set2 A1298C wt-lead | 5' tga cca gtg aag Aaa gtg tct ttg a (SEQ ID NO: 76)
Primer set2 A1298C wt-lag | 5' tca aag aca ctt Tct tca ctg gtc a (SEQ ID NO: 77)
Primer set2 A1298C polymorph-lead | 5' tga cca gtg aag Caa gtg tct ttg a (SEQ ID NO: 78)
Primer set2 A1298C polymorph-lag | 5' tca aag aca ctt Gct tca ctg gtc a (SEQ ID NO: 79)

9. ALDH2

Primer set1 wt-lead | 5' cag gca tac act Gaa gtg aaa act g (SEQ ID NO: 80)
Primer set1 wt-lag | 5' cag ttt tca ctt Cag tgt atg cct g (SEQ ID NO: 81)
Primer set1 polymorph-lead | 5' cag gca tac act Aaa gtg aaa act g (SEQ ID NO: 82)
Primer set1 polymorph-lag | 5' cag ttt tca ctt Tag tgt atg cct g (SEQ ID NO: 83)

Primer set1-lead | 5' tgc atc tct gcc tta cag atc atg t (SEQ ID NO: 84)
Primer set1-lag | 5' aga tga tct gta agg cag aga tgc a (SEQ ID NO: 85)



Advice decision tree 

The results of genetic polymorphism analysis may be used to 
correlate the genetic profile of the donor of the sample with 
disease susceptibility using the first dataset, which 
provides details of the relative disease susceptibility 
associated with particular polymorphisms and their 
interactions. The risk factors identified using dataset 1 
can then be matched with dietary and other lifestyle 
recommendations from dataset 2 to produce a lifestyle advice 
plan individualised to the genetic profile of the donor of 
the sample. Examples of datasets 1 and 2 which may be used to
generate such advice are illustrated in Figure 1.

To enable appropriate advice to be tailored to particular
susceptibilities, a ranking system is preferably used to
provide an indication of the degree of susceptibility of a
specific polymorph to risk of cancer(s) and/or other
conditions. The ranking system may be designed to take
account of homozygous or heterozygous alleles in the client's
sample, i.e. the same or different alleles being present in
the diploid nucleus. Five categories which may be used are
summarised below:

(i) Reduced susceptibility: where an allele has been
shown to reduce susceptibility.

(ii) Normal susceptibility: where an allele has been shown
to have a normal susceptibility of risk to cancer(s) or
disease. This is generally the homozygous wild type
allele or a polymorphism that has been shown to have
similar function.

(iii) Moderate susceptibility: where a heterozygous
genotype is present that contains the wild type of
the allele (i.e. normal susceptibility) and an allele
of the polymorphism known to give rise to higher
susceptibility to specific cancer(s) or disease.

(iv) High susceptibility: where a homozygous genotype that
contains the polymorphism is present with a higher
risk of cancer susceptibility.

(v) Higher susceptibility: where a higher susceptibility
has been observed for specific cancer(s) or disease
due to the combined effects of two or more different
gene targets.
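
The five-category ranking can be expressed as a small function, sketched below. The names, the use of a risk-allele copy number, and the order in which the rules are checked are assumptions made for illustration; the specification does not prescribe an implementation.

```python
from enum import Enum


class Susceptibility(Enum):
    REDUCED = "reduced"     # category (i)
    NORMAL = "normal"       # category (ii)
    MODERATE = "moderate"   # category (iii)
    HIGH = "high"           # category (iv)
    HIGHER = "higher"       # category (v)


def rank_genotype(
    risk_allele_copies: int,
    protective: bool = False,
    interacting_risk: bool = False,
) -> Susceptibility:
    """Assign one of the five categories to a genotype at a single locus.

    Assumed precedence: a known gene-gene interaction is checked first,
    then a protective allele, then the number of risk-allele copies.
    """
    if interacting_risk:
        return Susceptibility.HIGHER
    if protective:
        return Susceptibility.REDUCED
    if risk_allele_copies == 0:
        return Susceptibility.NORMAL      # homozygous wild type
    if risk_allele_copies == 1:
        return Susceptibility.MODERATE    # heterozygous
    if risk_allele_copies == 2:
        return Susceptibility.HIGH        # homozygous for the polymorphism
    raise ValueError("a diploid genotype carries 0, 1 or 2 copies of an allele")


if __name__ == "__main__":
    for copies in (0, 1, 2):
        print(copies, rank_genotype(copies).value)
```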

Using dataset 1, a susceptibility may be assigned to each
polymorphism identified and, from dataset 2, a lifestyle
recommendation corresponding to each susceptibility
identified may be assigned. For example, if an individual is
found to have the NAT1*10 polymorphism, the decision tree may
indicate that there is an enhanced susceptibility to
colonic cancer. Recommendations appropriate to minimising
the risk of colonic cancer are then generated. For example,
the recommendations may be to avoid particular foods
associated with increased risk and to increase consumption of
other foods associated with a protective effect against such
cancers. The totality of recommendations may be combined to
generate a lifestyle advice plan individualised to the donor
of the sample. The decision tree is preferably arranged to
recognise particular combinations of polymorphisms and/or
susceptibilities which interact either positively, to produce
a susceptibility greater than would be expected from the risk
factors associated with each individually, and/or
negatively, to reduce the susceptibility associated
with each individually. Where such combinations are
identified, the advice generated can be tailored accordingly.
For example, the combination of NAT2*4 and NAT1*10
polymorphisms has been linked to increased cancer risk
(Bell, 1995). Therefore, when such a combination of
polymorphisms is identified from a subject's DNA, the
associated very high susceptibility to cancer is assigned and
the advice tailored to emphasise the need to reduce
consumption of xenobiotics, e.g. by reducing or eliminating
consumption of char-grilled foodstuffs.
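
The two-dataset lookup and the handling of interacting polymorphisms can be sketched as follows. The dictionary contents are placeholders built from the two examples in the text (NAT1*10 alone, and NAT1*10 with NAT2*4); they are not the datasets of Figure 1. The rule that a matched combination replaces the advice for its individual members is an assumption about how the tailoring might be done.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

Risk = Tuple[str, str]  # (condition, degree of susceptibility)

# Dataset 1 (placeholder): polymorphism(s) -> susceptibility.
DATASET_1: Dict[FrozenSet[str], Risk] = {
    frozenset({"NAT1*10"}): ("colonic cancer", "enhanced"),
    frozenset({"NAT1*10", "NAT2*4"}): ("cancer", "very high"),
}

# Dataset 2 (placeholder): susceptibility -> lifestyle recommendation.
DATASET_2: Dict[Risk, str] = {
    ("colonic cancer", "enhanced"): (
        "Avoid foods associated with increased risk and increase "
        "consumption of foods associated with a protective effect."
    ),
    ("cancer", "very high"): (
        "Reduce or eliminate consumption of char-grilled foodstuffs."
    ),
}


def advice_plan(polymorphisms: Iterable[str]) -> List[str]:
    """Return recommendations for the polymorphisms found in a sample."""
    found = set(polymorphisms)
    matched = [combo for combo in DATASET_1 if combo <= found]
    # Assumed rule: a matched combination supersedes the rules for its subsets.
    specific = [c for c in matched if not any(c < other for other in matched)]
    return [DATASET_2[DATASET_1[c]] for c in sorted(specific, key=sorted)]


if __name__ == "__main__":
    print(advice_plan(["NAT1*10"]))
    print(advice_plan(["NAT1*10", "NAT2*4"]))
```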

In generating the advice, other factors such as information
concerning the sex and health of the individual and/or of
the individual's family, age, alcohol consumption, and
existing diet may be used in the determination of appropriate
lifestyle recommendations.



Experimental 

Example 1 Preparation of DNA Sample 

DNA is prepared from a buccal cell sample on a brush using a
Qiagen QIAamp kit according to the manufacturer's
instructions (Qiagen, Crawley, UK). Briefly, the brush is cut
in half and one half stored at room temperature in a sealed
tube in case retesting is required. The other half of the
brush is placed in a microcentrifuge tube. 400 µl PBS is added
and the brush allowed to rehydrate for 45 minutes at room
temperature. Qiagen lysis buffer and Proteinase K are then
added, the contents are mixed, and allowed to incubate at
56°C for 15 minutes to lyse the cells. Ethanol is added and
the lysate transferred to a QIAamp spin column from which DNA
is eluted after several washings.

Example 2 Quantification of DNA 

In order to check that sufficient DNA has been isolated, a
quantification step is carried out using the PicoGreen dsDNA
Quantification kit (Molecular Probes, Eugene, Oregon, USA).

Briefly, client DNA samples are prepared by transferring a
10 µl aliquot into a microcentrifuge tube with 90 µl TE. 100 µl
of the working PicoGreen dsDNA quantification reagent is
added, mixed well, and transferred into a black 96 well plate
with flat well bottoms. The plate is then incubated for 5
minutes in the dark before a fluorescent reading is taken.
The quantity of DNA present in the clients' samples is
determined by extrapolating from a calibration plot prepared
using DNA standards.

A quantity of DNA in the range of 5-50 ng total is used in the
subsequent PCR step. Remaining client DNA sample is stored
at -20°C for retesting if required.
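
Reading a DNA quantity off the calibration plot amounts to fitting a line through the standards and inverting it. The sketch below shows one way this could be done. The standard concentrations and fluorescence readings are invented for illustration, the linear response is an assumption, and the dilution factor of 10 reflects only the 10 µl into 100 µl dilution in TE, assuming the standards were mixed with reagent in the same way as the samples.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def stock_concentration(
    signal: float, slope: float, intercept: float, dilution_factor: float = 10.0
) -> float:
    """Convert a fluorescence reading into ng/µl of the undiluted sample."""
    return (signal - intercept) / slope * dilution_factor


def within_pcr_range(total_ng: float) -> bool:
    """Check the 5-50 ng window stated for the subsequent PCR step."""
    return 5.0 <= total_ng <= 50.0


# Hypothetical standards: concentration (ng/µl) against fluorescence units.
STANDARD_CONC = [0.0, 1.0, 2.0, 4.0]
STANDARD_SIGNAL = [50.0, 250.0, 450.0, 850.0]

if __name__ == "__main__":
    slope, intercept = fit_line(STANDARD_CONC, STANDARD_SIGNAL)
    conc = stock_concentration(650.0, slope, intercept)
    print(f"estimated stock concentration: {conc:.1f} ng/µl")
    print("1 µl in range for PCR:", within_pcr_range(conc * 1.0))
```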

Example 3 Taqman® Assay to Identify the MTHFR A12 98C 
polymo rphi sm 

30 The modified reaction mixture contains Taq polymerase (1.25 

units/yl), optimised PCR buffer, dNTP (200uM each), 2mM MgCl 2 
and primer pairs SEQ ID NO: 160 and 161 and polymorphism 
probe SEQ ID NO: 200. 
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The reaction mixture is initially incubated for 10 minutes at 
50°C, then 5 minutes at 95°C, followed by 40 cycles of 1 
minute of annealing at between 55°C and 60°C and 30 seconds 
of denaturation at 95°C. Both during the cycles and at the 
5 end of the run, fluorescence of the released reporter 
molecules of the probe is measured by an integral CCD 
detection system of the AB7700 thermocycler . The presence of 
a fluorescent signal which increases in magnitude through the 
course of the run indicates a positive result. 

10 

The assay is then repeated with the same primer pair and wt 
probe SEQ ID NO: 199. If the sample is homozygous for the 
polymorphism, no fluorescence signal is seen with the wt 
probe. However, if the sample is heterozygous for the 
polymorphism, a fluorescence signal is also seen with the wt 
probe. If single reporter results from homozygous wt, 
homozygous polymorphic and heterozygous polymorphic samples 
are plotted on an X/Y axis, the homozygous 
alleles will cluster at opposite ends of the axes relative to 
each reporter, and the heterozygous alleles will cluster at a 
midway region. 
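
The interpretation of the two probe runs described above can be sketched as a simple decision rule. This is an illustrative sketch only: the fixed signal threshold and the function name are assumptions, and the specification itself describes reading genotypes from clusters on a plot and does not describe a numerical cut-off.

```python
def call_genotype(wt_signal, polymorphism_signal, threshold):
    """Classify a sample from the end-point signals of the two probes.

    `threshold` is a hypothetical cut-off separating signal from background.
    """
    wt_positive = wt_signal >= threshold
    polymorphism_positive = polymorphism_signal >= threshold
    if wt_positive and polymorphism_positive:
        return "heterozygous"
    if polymorphism_positive:
        return "homozygous polymorphic"
    if wt_positive:
        return "homozygous wt"
    return "no call"  # neither probe gave a signal; the assay should be repeated


# Hypothetical readings in arbitrary fluorescence units.
calls = [
    call_genotype(0.1, 2.4, threshold=1.0),
    call_genotype(2.2, 2.1, threshold=1.0),
    call_genotype(2.5, 0.2, threshold=1.0),
]
```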

Example 4 DNA Array method for identifying multiple 
polymorphisms 

a) PCR amplification 

The PCR reaction mix contains Taq polymerase (1.25 
units/reaction), optimised PCR buffer, dNTPs (200 µM each) 
and MgCl2 at an appropriate concentration of between 1 and 4 
mM, and 40 pmol of each primer (SEQ ID NOs: 1-8, 17-63) for 
amplification of seven fragments and the sample DNA. 



The reaction mixture is initially incubated at 95°C for 1 
minute, and then subjected to 45 cycles of PCR in a MWG 
TC9600 thermocycler (MWG-Biotech-AG Ltd., Milton Keynes, UK) 
as follows: 

- annealing 50°C, 1 minute 

- polymerisation 73°C, 1 minute 

- denaturation 95°C, 30 seconds. 

After a further annealing step at 50°C, 1 minute, there is a 
final polymerisation step at 73°C for 7 minutes. 
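
For illustration, the cycling programme above can be written down as data and its programmed hold time summed. The representation and names are assumptions for this sketch and do not correspond to any thermocycler's programming interface. Ramp times between temperatures are not included.

```python
# Temperatures in °C and hold times in seconds, transcribed from the text.
INITIAL_INCUBATION = (95, 60)
CYCLE_STEPS = [
    ("annealing", 50, 60),
    ("polymerisation", 73, 60),
    ("denaturation", 95, 30),
]
CYCLES = 45
FINAL_STEPS = [
    ("annealing", 50, 60),
    ("polymerisation", 73, 7 * 60),
]


def total_hold_seconds():
    """Sum of all programmed holds, excluding ramping between temperatures."""
    per_cycle = sum(seconds for _, _, seconds in CYCLE_STEPS)
    final = sum(seconds for _, _, seconds in FINAL_STEPS)
    return INITIAL_INCUBATION[1] + CYCLES * per_cycle + final


run_seconds = total_hold_seconds()
```

With the values given, the programmed holds come to 7290 seconds, i.e. about two hours before ramping is taken into account.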

Instead of the MWG TC9600 thermocycler, other thermocyclers, 
such as the Applied Biosystems 9700 thermocycler (Applied 
Biosystems, Warrington, UK), may be used. 

After amplification of the target genes, generation of 
product is checked by electrophoretic separation using a 2% 
agarose gel, or a 3.5% NuSieve agarose gel. 

The PCR amplification products are then purified using the 
Qiagen QIAquick PCR Purification Kit (Qiagen, Crawley, UK) to 
remove dNTPs, primers, and enzyme from the PCR product. The 
PCR product is layered onto a QIAquick spin column, a vacuum 
applied to separate the PCR product from the other reaction 
products and the DNA eluted in buffer. 

b) RNA transcription and Fluorescent Labelling of PCR 
products 

The DNA is then transcribed into RNA using T3 and T7 RNA 
polymerases together with fluorescently labelled UTP for 
incorporation into the growing chain of RNA. The reaction 
mixture comprises: 

20 µl 5X reaction buffer; 500 µM ATP, CTP, GTP, fluorescent UTP 
(Amersham Ltd, UK); DEPC treated dH2O; 1 unit T3 RNA 
polymerase or 1 unit T7 RNA polymerase (Promega Ltd., 
Southampton, UK); 1 unit RNasin ribonuclease inhibitor and 
DNA from PCR (1/3 of total, 10 µl in dH2O). 

The mixture is incubated at 37°C for 1 hour. The mixture is 
then treated with DNase to remove DNA so that only newly 
synthesised fluorescent RNA is left. The RNA is then 
precipitated, microcentrifuged and resuspended in buffer for 
hybridisation on the array. 

c) Polymorphism Analysis 

The sample amplified fragments are then tested using a DNA 
microarray. 

The DNA microarray used comprises oligonucleotides SEQ ID 
NOs: 1-85. These oligonucleotides are applied by a robot onto 
a glass slide and immobilised. The fluorescently labelled 
amplified DNA is introduced onto the DNA microarray and a 
hybridisation reaction conducted to bind any complementary 
sequences in the sample, allowing unbound material to be 
washed away. The presence of bound samples is detected using 
a scanner. The absence of a fluorescent signal for a specific 
oligonucleotide probe indicates that the client does not have 
the corresponding polymorphism. 
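
The scanner read-out described above can be sketched as a thresholding step followed by a look-up from probe to polymorphism. This is a minimal sketch under stated assumptions: the threshold, the probe labels and the probe-to-polymorphism table are all hypothetical and are not taken from the sequence listing.

```python
def detected_probes(spot_signals, threshold):
    """Return the probe identifiers whose scanned signal reaches the threshold."""
    return sorted(probe for probe, signal in spot_signals.items()
                  if signal >= threshold)


def polymorphisms_present(spot_signals, probe_to_polymorphism, threshold):
    """Map probes with a signal to the polymorphisms they report on.

    A probe with no signal contributes nothing, mirroring the rule that the
    absence of a signal indicates the absence of the polymorphism.
    """
    hits = detected_probes(spot_signals, threshold)
    return sorted({probe_to_polymorphism[probe] for probe in hits
                   if probe in probe_to_polymorphism})


# Hypothetical scan values (arbitrary units) and probe annotations.
signals = {"probe_A": 12000, "probe_B": 150, "probe_C": 9000}
annotation = {"probe_A": "variant_1", "probe_B": "variant_2", "probe_C": "variant_1"}
found = polymorphisms_present(signals, annotation, threshold=1000)
```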

Example 5 DNA Array method for identifying G560A polymorphism 

The PCR reaction mix contains Taq polymerase (1.25 
units/reaction), optimised PCR buffer, dNTPs (200 µM each) 
and MgCl2 at an appropriate concentration of between 1 and 4 
mM, and 40 pmol of each primer (SEQ ID NOs: 88, 89) for 
amplification of the fragment. The method used is the same 
as detailed in Example 4, with the array comprising 
oligonucleotides SEQ ID NO: 17, 18, 19 and 20. 

The presence of bound samples is detected using a scanner as 
described above. A highly fluorescent spot is detected at the 
positions corresponding to the oligonucleotides SEQ ID NO: 19 
and 20. No signal is seen at the spots corresponding to SEQ 
ID NO: 17 and 18, demonstrating that the sample is not 
heterozygous for the wt allele. 

Example 6 Generation of Report 

The results of the microarray or Taqman® analysis are input 
into a computer comprising a first dataset correlating the 
presence of individual alleles with a risk factor and a 
second dataset correlating risk factors with lifestyle 
advice. A report is generated identifying the presence of 
particular polymorphisms and providing lifestyle 
recommendations based on the identified polymorphisms. An 
example of such a decision process is shown in Figure 2. 

A sample of DNA is screened and the alleles identified are 
input to a dataprocessor as dataset 3. Each allele is matched 
to a lifestyle risk factor from dataset 1, e.g. high 
susceptibility to colon cancer due to the presence of the 
NAT1*10 allele and the absence of the GSTM1 allele. The 
identified risk factor is then matched with one or more 
lifestyle recommendations from dataset 2, for example "avoid 
red meat, chargrilled food, smoked meats and fish; stop 
smoking immediately" (in order to avoid production of 
potentially toxic byproducts by Phase I enzymes with 
increased activity) and "increase consumption of vegetables 
of the allium family e.g. onions and garlic, and the 
brassica family e.g. broccoli" (in order to increase the 
activity of Phase II enzymes present, such as GSTP1 and GSTT1 
and others, in order to increase the excretion of toxic 
byproducts of Phase I metabolism). This is then checked 
against other factors input into the dataprocessor, e.g. age, 
sex and existing diet, to modify the recommendation 
accordingly before generating the final recommendation 
appropriate to the allele. The lifestyle recommendations are 
then assembled to generate a comprehensive personalised 
lifestyle advice plan. 
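
The matching of alleles to risk factors and of risk factors to advice can be sketched as two look-up tables and a filter. The sketch below is illustrative only: the table layout, the function names and the vegetarian rule used to stand in for the check against other client factors are assumptions, and only the NAT1*10/GSTM1 example and its recommendations are taken from the text.

```python
# Dataset 1 (assumed layout): alleles that must be present, alleles that must
# be absent, and the risk factor that the pattern indicates.
DATASET_1 = [
    (frozenset({"NAT1*10"}), frozenset({"GSTM1"}),
     "high susceptibility to colon cancer"),
]

# Dataset 2 (assumed layout): risk factor -> lifestyle recommendations.
DATASET_2 = {
    "high susceptibility to colon cancer": [
        "avoid red meat, chargrilled food, smoked meats and fish",
        "stop smoking immediately",
        "increase consumption of allium and brassica vegetables",
    ],
}


def adjust_for_client(advice_list, client):
    """Hypothetical modifier: skip meat advice if the existing diet is vegetarian."""
    if client.get("existing_diet") == "vegetarian":
        return [advice for advice in advice_list if "meat" not in advice]
    return list(advice_list)


def generate_report(sample_alleles, client=None):
    """Match identified alleles (dataset 3) to risk factors, then to advice."""
    client = client or {}
    alleles = set(sample_alleles)
    risk_factors = [risk for present, absent, risk in DATASET_1
                    if present <= alleles and not (absent & alleles)]
    plan = []
    for risk in risk_factors:
        for advice in adjust_for_client(DATASET_2.get(risk, []), client):
            if advice not in plan:  # avoid repeating advice shared by risks
                plan.append(advice)
    return {"alleles": sorted(alleles), "risk_factors": risk_factors, "plan": plan}


report = generate_report({"NAT1*10"}, {"existing_diet": "vegetarian"})
```

Keeping the two datasets separate from the matching logic means that new allele patterns or recommendations can be added without altering the report generation itself.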